Fortran2
'Fortran2: Getting the most from Fortran'
Following On
Roll Call: Jonny, Lauren, Emma, SarahS, Rita
These notes follow on from the Fortran1 course. You may want to look back at those notes and examples, or if you're feeling confident, just dive in here!
To get the example programs login in to a Linux box and type:
svn co http://source.ggy.bris.ac.uk/subversion-open/fortran2/trunk fortran2
Allocatable Arrays
We'll begin by looking at allocatable arrys.
Previously, we declared the size of our arrays when we wrote our code. This is all well and good, and at least we knew where we stood. However, this exact specifiation of the shape of all things can become a painful limitation. Suppose you were writing a program to process a list of University employees. The number of employees will almost certainly vary from one month to the next. So, we can only know the number of records to read in at runtime. This sort of thing starts to crop up a lot when we aim to write more general-purpose, flexible & robust programs.
Fear not! however, as trusty Fortran90 is here to help. In Fortran90 we can write array declarations of the form:
<type>,allocatable,dimension(:) :: my1dAllocatableArray ! a 1D array of type <type>, which I will give a size to later
Let's go to an example:
cd fortran2/examples/example1
and take a look inside the file allocatable.f90.
In this example, we have a 2D grid and corresponding arrays of longitude and latitude coordinates, the sizes of which we will read from a namelist file at runtime. We will then allocate an appropriate amount of space, fill in the oordinate values according to the number of grid-cells specified in the file and report the whole lot to standard-out. Our arrays are declared with the allocatable attribute:
real,allocatable,dimension(:) :: x_coords ! x-coordinate values real,allocatable,dimension(:) :: y_coords ! y-coordinate values real,allocatable,dimension(:,:) :: data ! 2D data on grid
We've covered namelists before, so reading the values of the variables nx and ny is straighforward. Once we know these values we can request memory space for our arrays. Note that it is an excellent idea to check whether this request was satisfied. If it were not for some reason (if you request a huge amount of memory, your computer may not be able to oblige for example) we probably want the program to stop, lest bad things happen!
allocate(x_coords(nx),stat=errstat) if (errstat /= 0) then write (*,*) 'ERROR: could not allocate x_coords array' stop endif
Next is a spot of arrithmetic to fill out the coordinate arrays properly. We want cell-centre coordinates, with longitude running from 0 to 360 degrees and lattitude from -90 to +90 degrees.
So we're done? Almost, but not quite! It is vital that once you're finished with it, you clear up any memory that you allocated earlier. If you don't, bad things are very likely to happen. So the last few lines look like:
if(allocated(data)) deallocate(data) if(allocated(x_coords)) deallocate(x_coords) if(allocated(y_coords)) deallocate(y_coords)
Here, we're a bit smarter than the average bear, and we've checked to see if an array has indeed been allocated before deallocating it. It would be unwise to try to deallocate something which had not been allocated in the first place.
So there we are. Sorted. Allocatable arrays. Try varying the values in input.nml first, and then procede to writing a few allocatable arrays of your own.
User Derived Types
Next up, let's look at another very useful feature provided by good old Fortran90---user-derived types:
cd ../example2
Way back at the start of Fortran1 we looked at the half dozen or so intrinsic types offered by Fortran90. It's not so much that this set of types is becoming a limitation (as we found with static array sizes) but more that for a large program things get a bit, well, messy! "Not the end of the world", you may say. And you're right. However, mess does lead to bugs and bugs are a huge problem, so actually messy, spaghetti-like code, is a significant problem that we want to avoid at all costs.
Now, user-derived types offer us something simple, yet potentially extremely powerful--we can group things which are related into a single composite type. That is, we can group all the things that belong together into a single entity and then pass that entity around as work is done by our program. This is a huge bonus when designing and maintaining a reasonably sized program. As we'll find out with growing experience, user-derived types are, pretty much, the best thing since sliced bread!
User-derived types represent a first step down the road towards object-oriented programming. A step in the right direction in terms of designing programs.
OK, enough of the spiel. Let's take a look at user-types.f90.
I recently bought a DAB radio and I'm as pleased as punch with it. One of the great features is that you can select a radio station by name, push a button and bingo!, you're tuned in. No more scrolling around the dial trying to get clear reception. That makes it really easy to 'station hop' and find something good that's on. Now, imagaine we were designing a program to install into a DAB radio. It's natural that we would want to keep the radio station name and tuning frequency (still needed, of course, but only by the radio itself as it now does the tuning for you) together in a single bundle. They belong together, after all, and are of different intrinsic types--a character string and a real number. In the program it looks like:
type station real :: frequency ! MHz character(len=maxstr) :: name ! str end type station
It wouldn't be much of a radio unless it had a number of stations. We can use an (allocatable) array of entities of our new derived type to store all the required station information:
type(station),allocatable,dimension(:) :: stations ! array of stations
I've made the array allocatable for flexibility, or perhaps out of habit. I guess we could get more sophisticated and grow the array should more stations became availble. Which does indeed happen..
After allocating the array (and importantly checking to see whether we hit a problem in doing so), all I need to do is fill out the array and print it to the screen. Note the use of % to access elements of a derived-type.
We can pass derived-types around, into and out of subroutines for example, just like intrinsic types. This enables us to write very elegant, compartmentalised and hence maintainable programs. If we discover that we need some more information added to our radio station type, we just add it in one place and the rest of the program can remain unchanged--very handy!
Have a go now at writing a subroutine which takes the radio station array as an argument. Then add something new to the derived type and access it in your subroutine. Note that you can use the syntax:
type(station),dimension(:) :: stations
to specify an array argument of unknown length. You can use the size and shape operators inside the subroutine to find out the length of the array in due course.
Modules
cd ../exampe3
In this example, we'll take a look at another very useful feature provided by Fortran90--modules. These provide further mileage down the road toward Object Oriented programming. They also do away with the need for common blocks, as used by older incarnations of the language.
Put simply, modules provide a way to collect together related variables and routines into a single entity. Other parts of your program may gain access to this entity through the use statement. In this way we see how the spaghetti tendency of large programs can be tamed, and how the slithery mess of pasta noodles starts to become a manageable collection of components that we plug together in a style more reminiscent of lego or mechano.
Modules also have other useful side-effects. For example, we get compiler interface (argument list) checks between your subroutine calls and definitions for free. Not to be sneezed at when a bug causes your program to suddenly die for some inexplicable reason. An argument list mismatch is a common culprit for this.
Let's take a look at our example module in mymod.f90. In this example we see the accumulation of other features of Fortran90 shown in previous examples. For example, we see allocatable arrays, user-derived types, reading data from namelists, checks for memory allocation and deallocation, paramter attributes on 'global' constants etc. etc.
On the first line of the file we see:
module mymod
This is key to creating a module and differs from the program prog we see on the first line of prog.f90. There is, of course, a corresponding end statement on the last line of the file.
We have the ubiquitious implicit none and then the declaration of any variables we wish place in the module. We have a constant named pi, which can be made available to any other part of the program which connects to this module. We could also make available non-constant variables in this way too, if we really had to!. This is how modules replace the sharing of variables via common blocks. Sharing 'global variables' in this way is worth avoiding, however, if you possibly can, as it is another common source of bugs and encourages spaghetti coding.
Next up we have a type declaration for the rather simple 'model' that we'll be using. This user-derived type encapsulates the internal model state together with a record of how many timesteps have been asked of the model. We need an instance of this type to work with and this is declared by:
type(simple_model),private :: modelA
Note the new attribute, private. This means that this instance will not be made available to other parts of the program which connect to this module. This is a good example of defensive programming through data hiding. It simplifies the interface that the rest of the program has to our model and also guards against bugs which could creep in through the inadvertant modification of our simple model's internal state. Trust me, privacy is a good thing and well worth getting into the habit of specifying. It also prompts one to think about the design of our program a bit more, which is always a good thing.
Following on from our variable declaration, we have:
contains
This signals the end of the variables and the start of the defnitions of any routines (subroutines or functions) that we've included in our module. In this case we have three subroutines for initialising, timestepping and finalising our model, respectively. The routines contain code constructs that we have seen in previous examples. It's worth studying the error checks, use of external files, memory allocation and deallocation and the syntax for accessing elements of a derived type, just as a refresher.
OK, so much for the module, now let's look at a program which uses it---prog.f90.
This is a rather simple program, now that much of the action is hived-off into a module. Note that the very first line of the program, before even our beloved implicit none is:
use mymod, only: pi, initialise_model, timestep_model, finalise_model
Here we state that we will use the named module in this program (we could use a number of modules). Some additional defensive programming is that we only include the things we need from that module. This simplifies our life. We then procede to make use of any imported varaibles and routines as we see fit. Note, however, that we do not need direct access to the hidden internal state of our simple model.
OK, now that you are familiar with the vital anatomy of Fortran90 modules, have a play around. Start by changing the start-up state of the model by editing imput.nml. You could then proceed, for example, by adding new varaibles or subroutines to the module and making use of them in the program. When you're happy with that, you could experiment with the only and private keywords or even create a whole new module. Go on! Why not?:)
Using separate files
In the Fortran practicals so far, all compilations steps have been taken care of by the powerful build system program called make. However, we are reaching the time where you need to get your hands dirty and see how compilation can be done "by hand". Then we will see how the Fortran program can be split over several files.
$ cd ../example4
Invoking the compiler
The compiler that you have been using during the Fortran practicals is called g95. It is a free open source compiler. A range of compilers is available on the Linux platform. gfortran is another free software alternative. ifort and pgf90 are commercial offerings from Intel and the Portland Group respectively. In this section, the g95 compiler will be used but note that the practical would work just as well with any of these compilers.
To compile a Fortran file with g95, simply invoke g95 with the name of the file. It will then create a file called a.out which is executable on the machine you compiled it on.
$ cd hello_world $ g95 hello_world.f90 $ ls-rtl total 420 -rw-r--r-- 1 ggjpr uucp 93 Mar 3 20:39 hello_world.f90 -rwxr-xr-x 1 ggjpr uucp 411750 Mar 3 20:41 a.out
Try to execute the file a.out. Also not the size difference between the source file and the executable. Source files take up a lot less room.
The compiler could be invoked with many options to check differents aspects of the fortran syntax, to control what types of warnings to issue, to perform various optimisations of the code or to create debugging information. We will not go into details in this practical so refer to the compiler documentation if you want more information.
If you modify the source file, you need to recompile this file it before executing a.out for your changes to be taken into account.
The first compiler option we will be using is used to change the name of the executable file. This is done with the option -o followed by the name required:
$ g95 hello_world.f90 -o myprog
Now that we can invoke the compiler manually, let's see how to split a Fortran program into multiple files.
Creating objects and linking them
First of all, let's see how to put a subroutine in a separate file. In the subroutine folder you will find a basic Fortran program in the file onefile.f90. It contains a main program unit and a subroutine.
$ cd ../subroutine
You can compile it using the method above.
$ g95 onefile.f90 -o myprog $ ./myprog
The content of this small Fortran program has also been split into two separate files:
- one for the main program unit (main_program.f90)
- one the subroutine (sub1.f90).
If you invoke the compiler on either of these files, it will complain that something is missing. Try it to see the two types of error messages.
The reason for this is that (1) the main program needs the definition of the subroutine and (2) a Fortran program need a main program unit so neither file is enough on its own. Luckily, you can also invoke the compiler on the two files at the same time.
$ g95 main_program.f90 sub1.f90 -o myprog
It works! (and note that the order of the files does not matter). By giving two files as arguments to the compiler, it is clever enough to understand that there might be dependencies between the files and it takes care of them automatically.
The problem with this approach is that if the subroutine is modified, the main program will also have to be recompiled to generate the executable. This would be a issue for larger Fortran programs because it would be time consuming. We will also see later on that we might want to included external subroutines contained in a library and to which the source code is not available.
Instead, it is possible to ask the compiler to compile a file and "ignore" the dependencies. By doing this, the compiler creates an object file (usual extension .o) which contains all the information needed to create an executable minus the actual dependencies. This is done by using option -c for "compile only".
$ g95 main_program.f90 -c $ g95 sub1.f90 -c
Check that two new files have been created. By default, they should be named like the source file but with a .o extension. Again, this default behaviour could be changed with the option -o.
Then the two objects have to be linked together to create an executable file. This is done by invoking a the compiler on the object files:
$ g95 sub1.0 main_program.o -o myprog
This might seem like an extra step compared to the previous method but the advantage of this method is that you can modify one file independently of the other, re-create the object for the modified file only and then link the objects together. If you have many files in your project, that would save a significant amount of time.
Try to do it,
Note 1: Although you can modify file independently, it still needs to make sense in term of Fortran. For instance, if you add an argument to the subroutine but don't modify the main program unit, the comiler will complain at the linking stage... Note 2: if you have many files in your project, understanding which ones need to be recompiled can be a complex task. This is why you are encourage to use a "build system". The course contains an introduction to the make build system later on.
Libraries
Will do this too - JPR
Appendices
NetCDF is fast becoming the de facto file format for model data. It is a well-designed format for storing your own data and also a convenient way to share data with your collaborators. One of the it's many benefits is that the file format is endian-independent. That means that we don't need to worry, for example, about which architecture (linux, windows, mac etc.) the data was created on
This section ontains a number of data-structures (derived types) and routines for reading and writing NetCDF files. They are bound up into several examples. I believe that these will be useful to adopt in your own programs, so feel fee to copy and use them.
To go further
That's it for the Fortran tuition. The next units will be about project management using make, debugging and version control.