6810: 1094 Session 15

In this session, we'll try out some basic examples of parallel processing using MPI. Two versions of the test codes are available: one set using C++ classes in the main Session 15 directory and another set using functions in the function_versions subdirectory.

Introduction to Parallel Processing with MPI

This is a very brief introduction by example to using "Open MPI", which is an implementation of the "Message Passing Interface" or MPI (a library and software standard). We've set things up so we can use the computers in Smith 1094 running Linux as parallel processors. Communication between machines is carried out via ssh, but we'll set it up so you don't need to enter passwords repeatedly. If your computer is not already running Linux, restart it into Linux now.

In a message passing implementation of parallel processing, multiple "tasks" are created, each with a unique identifying name, which run on different computers (multiple processes can run on one machine as well). The tasks talk to each other by sending and receiving "messages". A "message" is an array with elements that include data, identifiers for the source and destination processes, and a tag used to distinguish between messages. Usually a fixed set of processes is created when the program is initialized but the processes can be executing different programs (this is referred to as "Multiple Instruction Multiple Data" or MIMD).

There are many functions defined in the MPI library. We'll look at a few of the most important and heavily used functions here. See the documentation list below for more information.

  1. Setting up ssh. This procedure will let you log in to all of the machines in 1094 without giving a password every time. (Skip any steps you've already done before.)
    1. Generate the files ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub with the command: ssh-keygen -t rsa (hit return at the "Enter file" request). When prompted, enter a passphrase (this is like a password, but generally longer; a good example is "physics6810rocks" :), which you will need to remember. Were id_rsa and id_rsa.pub created?

    2. Copy the contents of ~/.ssh/id_rsa.pub into the file ~/.ssh/authorized_keys (you will have to create this file if it does not already exist). Set the permissions with: chmod 600 ~/.ssh/authorized_keys
    3. ssh-agent should already be running; check with ps aux | grep agent. If no process called ssh-agent is listed, start one by typing ssh-agent $SHELL.
    4. Give the command ssh-add. You will be prompted (just this once!) for your passphrase. If successful, you will get an "Identity added:" message.
    5. Check that you can ssh without a password by connecting to at least four other computers (logging out each time; this only works from your original computer, where you typed ssh-add). The computer names in Smith 1094 start with "sm1094" and end with a, b, c, ..., r. The first time you ssh, you will be asked if you want to connect. Answer "yes" and then you won't be asked again. Did it work?

  2. Setting up the environment. You need to add "/usr/lib64/openmpi/bin" to your PATH (which is searched to find programs) and "/usr/lib64/openmpi/lib" to your LD_LIBRARY_PATH (which is searched to find libraries to link to). You can set these by separate commands in your shell scripts, but here we'll do it with the module program:
       module load openmpi-x86_64
    (give the command module avail to see all of the available modules). Check your shell with the command: printenv SHELL. If you are using tcsh as your shell (most of you), add the module command to your .cshrc.path file (create this file in your main directory if it doesn't exist). If you are using bash as your shell, put the module command in your .bashrc. To activate the module command the first time, type source .cshrc (or source .bashrc). After that, it will be sourced automatically whenever you log in (or ssh to another computer).
  3. MPI Hello, World! program.
  4. Send and Receive.

    In the program from the last section, there was no communication between the processes running in parallel. Here we use a program that demonstrates two of the "point-to-point communication" functions to send and receive messages (there are more functions!). The program doesn't do anything but send numbers back and forth, but can easily be generalized to more substantive tasks.

  5. Calculating pi in parallel.

    Here we explore a program to calculate pi in parallel using "collective communication" functions. This means that information is distributed to and collected from the processes all at once (rather than communicating with each one-to-one). [Note: The infinite series used here is not an efficient method to calculate pi!]

  6. More information on MPI.

6810: 1094 Session 15. Last modified: 07:15 am, April 08, 2013.