780.20: 1094 Session 15

Handouts: "Three-Dimensional Plots with Gnuplot", "Using the GDB Debugger", printouts of eqheat.cpp, check_primes.cpp, and square_test.cpp

Today we'll look at a variety of small topics relevant for computational physics, particularly on Linux systems.

Your goals for this session:

3-D Plots with Gnuplot

We have frequently used Gnuplot for visualization, but only for two-dimensional plots. Now we want to make three-dimensional surface plots of functions and data.

  1. Follow through the handout on "Three-Dimensional Plots with Gnuplot".
  2. Figure out how to make a parametric plot of a sphere using trigonometric functions. To plot a 2-d circle, you would type:
      gnuplot> set parametric
      gnuplot> plot [0:2*pi] sin(t),cos(t)
    (hint: think spherical coordinates).
  3. Take a look at the eqheat.cpp code and guess at what it is doing. Compile and link it using make_eqheat and then run it to generate eqheat.dat. Look at eqheat.dat and then plot it with gnuplot, using the comments in the code and the handout as guides. Interpret the plot for your partner (and play with the output a bit). What equation is being solved? [You can take a look at laplace.cpp as well, which generates a similar looking plot for a different physical situation.]

Using the GDB Debugger

We'll step through a contrived example that illustrates the basic commands and capabilities of a debugger (in this case, the GNU debugger gdb). We will use the command-line ("no windows") version of gdb. There are graphical interfaces (such as ddd) that are much nicer to use for more extensive debugging, but it will be worthwhile to start with the simple, primitive version.

  1. When debugging, you may find it convenient to have two terminal windows open: one to re-compile and link a sample code and another to run gdb in. (Actually, you can interact with gdb directly through emacs, but we won't go into that here.)
  2. Go through the example from the handout "Using the GDB Debugger". The code to debug is check_primes.cpp (a copy is also provided, called check_primes_orig.cpp, so that you can go back to the original if necessary). This C++ code uses C-style statements for the experience of seeing the extra bugs you can get away with in C. We've also created a makefile that doesn't use any of our usual warning flags (at first!).
  3. (BONUS) Try out the DDD interface to gdb by following through the sample_ddd_session.ps.gz handout included in session15.tarz (you will need to spend more time than this to learn how to use DDD efficiently!).

Squaring a Number

One of the most common floating-point operations is to square a number. Two ways to square x are: pow(x,2) and x*x. Which is more efficient? Is there an efficient alternative?

  1. Look at the printout for the square_test.cpp code. It implements these two ways of squaring a number. The "clock" function from time.h is used to find the elapsed time. Each operation is executed a large number of times (determined by "repeat") so that we get a reasonably accurate timing.
  2. Compile, link, and run the code (use make_square_test). Adjust "repeat" until the minimum time for each is at least 0.1 seconds. Which way to square x is more efficient?

  3. If you have an expression (rather than just x) to square, coding (expression)*(expression) is awkward and hard to read. Wouldn't it be better to call a function (e.g., squareit(expression))? Add to square_test.cpp a function:
    double squareit (double x)
    that returns x*x. Add a section to the code that times how long this takes (just copy one of the other timing sections and edit it appropriately). How does it compare to the others? What is the "overhead" in calling a function (that is, how much extra time does it take)? When is the overhead worthwhile?

  4. Another alternative, common from C programming: use #define to define a macro that squares a number. Add
    #define sqr(z) ((z)*(z))
    somewhere before the start of main. (The extra ()'s guard against operator-precedence surprises when the argument is an expression such as a+b; always include them!) Add a section to the code to time how long this macro takes; what do you find?

  5. One final alternative: add an "inline" function called square:
    inline double square (double x) { return (x*x); }
    This serves as both the function prototype and the function definition. Put it up top with the squareit prototype. Add a section to the code to time how long this function takes. What is your conclusion about which of these methods to use? (Record the times for each method so you can compare them with the Intel compiler results below.)

  6. Finally, we'll try the simplest way to optimize a code: let the compiler do it for you! In the CFLAGS line in make_square_test, change the compile flag -O0 (no optimization) to -O3 (in both flags, the character after the dash is the uppercase letter O, not a zero). Recompile, link, and run the code (note that $(MAKEFILE) was added to the line with square_test.o to make sure that the program is recompiled if the makefile is changed). How do the times for each operation compare to the times before you optimized? (Increase repeat to make all times above 0.1 seconds again.)

  7. In your project programs, once they are debugged and running, you'll want to use the -O3 optimization flag. Note that there are other options you can learn about using man g++.

Using the Intel C++ Compiler

It's very useful to have more than one compiler available. The Intel C++ compiler, which is called "icpc", is particularly good (assuming you are running on an Intel processor such as a Pentium 4).

  1. In order to access the Intel compiler and libraries, we need to set some environment variables. These will be installation dependent, but the same settings work for all of the physics machines. One way to do this is to set them in your .bashrc file. Instead, we'll take a shortcut and use the "module" program. Type the commands indicated in the following. Now we're ready to use the compiler.
  2. Try the compiler on the square_test program. Copy make_square_test to make_square_test_icpc and modify it as follows: Note that you can do all this by simply defining alternative variables, so that it is easy to switch back and forth. (Or else redefine the variables rather than deleting the initial definitions, so you can switch back simply by changing the order.)
  3. Run the program and compare to the unoptimized g++ results.

  4. Now let's try optimization. For icpc, the options -O2 -tpp7 -xW provide very good optimization. Try it! For more information on other compiler flags to consider, look at man icpc or icpc -help. There are also optimized libraries, such as mkl_lapack.

  5. For g++, the -march=pentium4 compiler option (march is for "machine architecture") enables optimizations specific to the Pentium 4.


Profiling with gprof

A "profiling" tool, such as gprof, allows you to analyze how the execution time of your program is divided among various function calls. This information identifies the candidate sections of code for optimization. (You don't want to waste time optimizing a part of the code that is only active for 1% of the run time!) We'll use the eigen_basis.cpp code from an earlier session as a guinea pig.

  1. To use gprof, compile and link the relevant codes with the -pg option. You can do this most easily by editing make_eigen_basis and adding -pg to BOTH the CFLAGS and LDFLAGS lines. (Note that make_eigen_basis has $(MAKEFILE) added to several lines to ensure that the codes are recompiled if the makefile changes.)
  2. Execute the program as usual (choose a fairly large basis size so that it takes a while to execute, building up statistics), which generates a file called gmon.out that is used by gprof. The program has to exit normally (e.g., you can't stop it with control-c) and any existing gmon.out file will be overwritten.
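  3. Run gprof and save the output to a file (e.g., gprof.out). The >! used here is the csh/tcsh form of output redirection that overwrites an existing file; in bash, use > instead (or >| if noclobber is set):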
    gprof eigen_basis >! gprof.out
    Edit gprof.out and try to figure out from the "Flat profile" and the explanations where (i.e., in what functions) the program spends the most time. Would you try to optimize the section that finds the eigenvalues?

  4. Try profiling the square_test code. You might like to know in this case how much time each line uses, rather than each function. Try (after recompiling with -pg) the -l option:
    gprof -l square_test >! gprof.out
    Are the results consistent with the timings from the program?

780.20: 1094 Session 15. Last modified: 08:46 am, March 06, 2006.