[[Category:Estel]]
This article explains how to install and configure MPI so that the '''ESTEL''' model can be run in parallel on a network of computers. Note that this article is merely a quick run through the [http://www-unix.mcs.anl.gov/mpi/mpich2/downloads/mpich2-doc-install.pdf MPI Installer's Guide] available from http://www-unix.mcs.anl.gov/mpi/mpich2/.
= Pre-requisites =
== ssh key authentication ==
To use MPI on a network of computers, you need to be able to log in to any of the computers without user interaction (passwords, etc.). This is easily achieved using secure shell (ssh) key authentication. The methodology to set up ssh key authentication is described in the article "[[Configure_ssh_for_MPI| Configure ssh for MPI]]".
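As a quick sketch (the linked article covers this in detail; the commands below assume OpenSSH with <code>ssh-copy-id</code> available and use the example host names from later in this article), generating a key pair and copying the public key to a node looks like this:
<code><pre>
$ ssh-keygen -t rsa                  # generate a key pair; an empty passphrase avoids prompts
$ ssh-copy-id slave1.full.domain     # copy the public key to a node (repeat for each node)
$ ssh slave1.full.domain hostname    # should print the hostname without asking for a password
</pre></code>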
== Fortran 90 compiler ==
You need a Fortran 90 compiler to compile and run the TELEMAC system. When running simulations in parallel mode, MPI uses a wrapper around your existing compiler, usually called <code>mpif90</code>. This wrapper is built when MPI is compiled, so you need to have a Fortran 90 compiler installed ''before'' you attempt to compile MPI.
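For instance, a quick check that a Fortran 90 compiler is available in your <code>PATH</code> (here assuming <code>gfortran</code>; substitute your own compiler, e.g. <code>ifort</code> or <code>pgf90</code>):
<code><pre>
$ which gfortran
/usr/bin/gfortran
$ gfortran --version
</pre></code>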
= Download MPI =
You can download MPICH2 from http://www-unix.mcs.anl.gov/mpi/mpich2/. You will end up with a gzipped tarball called something like <code>mpich2.tar.gz</code>, with a version number in the name as well. Note that the TELEMAC system uses MPI-1 statements, but we encourage you to install MPICH2 (which is backward compatible) as TELEMAC will probably move towards MPI-2 at some point in the future.
For the sake of this article, we assume that you have extracted the tarball into a directory called <code>/path/to/mpi-download/</code>.
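For instance (the exact tarball name will include a version number):
<code><pre>
$ tar xzf mpich2.tar.gz    # the extracted directory is what this article calls /path/to/mpi-download/
</pre></code>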
= Compilation =
The compilation of MPI is fairly straightforward, but beforehand you need to create an install folder (called <code>/path/to/mpi/</code> here) and a build folder (called <code>/tmp/mpi-build/</code> here):
<code><pre>
$ mkdir /path/to/mpi
$ mkdir /tmp/mpi-build
</pre></code>
To configure the MPI build, the only required step is to set the environment variable <code>F90</code> to the name of your Fortran 90 compiler. This name needs to be in your <code>PATH</code>; if it is not, give the full path to the Fortran 90 compiler instead. Then the <code>configure</code> command will configure the build for you:
<code><pre>
$ cd /tmp/mpi-build
$ export F90=f90compiler
$ /path/to/mpi-download/configure --prefix=/path/to/mpi 2>&1 | tee configure.log
</pre></code>
This will test many features of your machine and configure the build. If the <code>configure</code> command finishes without problems, you are ready to build MPI. Note that you can inspect <code>configure.log</code> for problems; in particular, you want to make sure that the Fortran 90 compiler was accepted for the build.
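For instance, a quick (if crude) way to check is to search the log for the Fortran-related messages; the exact wording varies between MPICH2 versions:
<code><pre>
$ grep -i fortran configure.log
</pre></code>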
To compile and install MPI, just issue the standard <code>make</code> and <code>make install</code> commands:
<code><pre>
$ make
$ make install
</pre></code>
This will install MPI in <code>/path/to/mpi/</code>.

Note that you will need to install MPI on all the nodes in your network that will be used for MPI jobs. As they probably have the same computer architecture, you can simply copy the <code>/path/to/mpi</code> directory across.
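For example, assuming the node names used in the <code>mpd.hosts</code> example below and that <code>/path/to/</code> already exists on each node:
<code><pre>
$ scp -r /path/to/mpi slave1.full.domain:/path/to/
$ scp -r /path/to/mpi slave2.full.domain:/path/to/
</pre></code>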
= Configuration of MPI =
== PATH ==
Add <code>/path/to/mpi/bin</code> to your <code>PATH</code> environment variable. This often means adding the following lines to your <code>.bashrc</code> file:
<code><pre>
PATH=/path/to/mpi/bin:$PATH
export PATH
</pre></code>
Once this is done, you should be able to run the MPI Fortran 90 wrapper <code>mpif90</code>:
<code><pre>
$ which mpif90
/path/to/mpi/bin/mpif90
</pre></code>
This needs to be done on each node in the network. This is straightforward if your home directory is shared across all the nodes; otherwise you need to do it manually on each node.
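If the home directories are not shared, one possible way to append the same lines to <code>.bashrc</code> on the other nodes (again assuming the example node names) is:
<code><pre>
$ ssh slave1.full.domain 'echo "PATH=/path/to/mpi/bin:\$PATH; export PATH" >> ~/.bashrc'
$ ssh slave2.full.domain 'echo "PATH=/path/to/mpi/bin:\$PATH; export PATH" >> ~/.bashrc'
</pre></code>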
== .mpd.conf ==
MPI requires a file in your home directory called <code>.mpd.conf</code>, which contains the line:
<code><pre>
secretword=something_secret_but_don't_use_your_real_password
</pre></code>
where the value after <code>secretword=</code> is a random string of your choice (not your actual password). This file should be readable and writable only by you.
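For example, to create the file with suitable permissions (pick your own random string):
<code><pre>
$ echo "secretword=some_random_string" > ~/.mpd.conf
$ chmod 600 ~/.mpd.conf    # readable and writable by you only
</pre></code>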
== mpd.hosts ==
Create a file in your home directory called <code>mpd.hosts</code> which contains a list of the nodes to be used by MPI, one per line. This file needs to be created on the master node, i.e. the one from which you will launch MPI jobs. For example:
<code><pre>
master.full.domain
slave1.full.domain
slave2.full.domain
</pre></code>
= Using an MPI ring =
== Start the ring ==
On the master node, start a ring of MPD daemons on the machines listed in <code>mpd.hosts</code> with <code>mpdboot</code>:
<code><pre>
master $ mpdboot -n 3 -f ~/mpd.hosts
</pre></code>
where the argument to <code>-n</code> (3 in this example) is the number of nodes to include in the ring.
== Test the ring ==
The <code>mpdtrace</code> command lists the machines in the ring:
<code><pre>
master $ mpdtrace
master
slave1
slave2
</pre></code>
You can also check that processes are dispatched to the nodes by running a simple command, for instance <code>hostname</code>, across the ring:
<code><pre>
master $ mpiexec -l -n 5 hostname
2: slave1
1: master
4: master
0: slave2
3: slave2
</pre></code>
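To go one step further, you can run a small MPI program across the ring. The MPICH2 build tree contains a <code>cpi</code> example which computes pi in parallel; the sketch below assumes it was built in <code>/tmp/mpi-build/examples/</code> during the <code>make</code> step and that you first copy it somewhere visible to every node at the same path (a shared home directory, for instance):
<code><pre>
master $ cp /tmp/mpi-build/examples/cpi ~/
master $ mpiexec -n 5 ~/cpi
</pre></code>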
== Close the ring ==
The command <code>mpdallexit</code> is used to close the MPI ring:
<code><pre>
master $ mpdallexit
</pre></code>
= Troubleshooting =
== /etc/hosts ==
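A common cause of trouble when starting the ring is hostname resolution: if a node's own hostname resolves to the loopback address <code>127.0.0.1</code> in <code>/etc/hosts</code>, the mpd daemons on the other machines cannot contact it. A sketch of suitable entries (with made-up addresses) for the example nodes used in this article would be:
<code><pre>
192.168.0.1   master.full.domain   master
192.168.0.2   slave1.full.domain   slave1
192.168.0.3   slave2.full.domain   slave2
</pre></code>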