the PARMETIS Homepage
Parallel Basic Linear Algebra Subroutines (PBLAS) for distributed
memory MIMD computers. There are two libraries that contain PBLAS routines.
IBM provides PBLAS routines in their PESSL library, and the PBLAS
routines are also in the ScaLAPACK library libscalapack.a found in
/usr/apps/lib.
If you use the PBLAS in PESSL, it is suggested that you use it
in conjunction with the PESSL ScaLAPACK and the BLACS in /usr/lib.
If you use ScaLAPACK from /usr/apps/lib, then you will automatically
get the PBLAS in the same library.
To compile and link your code with the PBLAS libraries, use the
following flags, depending on the language of your program. For Fortran programs use
-L/usr/apps/lib -lscalapack -lblacsF77init -lblacs -lessl
For C programs use
-L/usr/apps/lib -lscalapack -lblacsCinit -lblacs -lessl
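As a rough sketch, the two flag sets above can be assembled in a small shell fragment that picks the BLACS initialization library by language. The variable names, the source file my_solver.f, and the use of mpxlf90_r as the compiler are assumptions for illustration only; the fragment prints the command line rather than running it.

```shell
# Language of the main program: "fortran" or "c".
lang=fortran

# Only the BLACS init library differs between the two languages.
case "$lang" in
  fortran) blacs_init=-lblacsF77init ;;
  c)       blacs_init=-lblacsCinit ;;
  *)       echo "unknown language: $lang" >&2; exit 1 ;;
esac

pblas_flags="-L/usr/apps/lib -lscalapack $blacs_init -lblacs -lessl"

# Print the link line; mpxlf90_r and my_solver.f are placeholders.
echo "mpxlf90_r my_solver.f $pblas_flags"
```

Setting lang=c switches the init library to -lblacsCinit and leaves the remaining flags unchanged.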
For more information see
Parallel ESSL (PESSL) is a scalable mathematical subroutine library
that supports parallel processing applications running on the SP or
workstation clusters. Parallel ESSL supports the Single Program Multiple
Data (SPMD) programming model, where the programs running the parallel task
are identical. The tasks, however, work on different sets of data.
PESSL contains ScaLAPACK-like solvers and PBLAS routines. This library
is usually used in conjunction with IBM's BLACS library.
The PESSL routines are installed in /usr/lib.
Programs are usually compiled with the compiler flag -lpessl.
If you compile your code with the thread-safe compilers, then link with
the libpesslsmp.a PESSL library using the compiler flag
-lpesslsmp.
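The choice between the two PESSL libraries can be sketched as follows, assuming the thread-safe compilers are the ones whose names end in _r. The compiler name and the source file my_code.f are placeholders, and other libraries your program needs (for example the BLACS) are not shown.

```shell
# Compiler invocation; replace with the one you actually build with.
compiler=mpxlf_r

case "$compiler" in
  *_r) pessl_lib=-lpesslsmp ;;  # thread-safe compiler: libpesslsmp.a
  *)   pessl_lib=-lpessl ;;     # otherwise: the usual PESSL library
esac

# Print the resulting command line; my_code.f is a placeholder.
echo "$compiler my_code.f $pessl_lib"
```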
For the sequential library, see
ESSL.
There are several ESSL and PESSL resources:
PETSc, the Portable, Extensible Toolkit for Scientific Computation,
provides sets of tools for the parallel (as well as serial) numerical solution
of PDEs that require solving large-scale, sparse nonlinear systems of
equations. PETSc includes nonlinear and linear equation solvers that employ a
variety of Newton techniques and Krylov subspace methods. PETSc provides
several parallel sparse matrix formats, including compressed row, block
compressed row, and block diagonal storage.
The PETSc library is installed on Cheetah at
/apps/PETSC and /usr/local/ACTS/PETSC.
Both versions 2.1.5 and 2.1.6 are available. Unlike the
other software packages, the PETSc
installation retains the directory structure of the package. There are no
symbolic links to its archive and header files. To use the
package, you must specify the full paths to these files.
For more information see the following:
PVODE actually refers to a trio of closely related solvers:
- CVODE for systems of ordinary differential equations
- KINSOL for systems of nonlinear algebraic equations
- IDA for systems of differential-algebraic equations
PVODE is a solver for large systems of ordinary differential equations
on parallel machines. It contains methods for the solution of both stiff
and non-stiff initial value problems. The name of this package has
recently been changed to SUNDIALS (SUite of Nonlinear and DIfferential/ALgebraic equation Solvers).
On Cheetah the PVODE libraries are installed in
/apps/PVODE/.
For more information see the following:
ScaLAPACK is a library of high-performance linear algebra
routines for distributed-memory message-passing MIMD computers and
networks of workstations supporting PVM and/or MPI.
The freely available version of ScaLAPACK from Netlib is installed in
/usr/apps/lib. There are also ScaLAPACK type solvers in
PESSL.
To compile and link your code with the ScaLAPACK libraries, you must use
the following linker flags:
     -L/usr/apps/lib -lscalapack -lblacsF77init -lblacs -lessl
or
     -L/usr/apps/lib -lscalapack -lblacsCinit -lblacs -lessl
depending on whether you program in Fortran (first line) or in C (second line).
See the
ScaLAPACK Home Page and
ACTS: ScaLAPACK for more information.
SuperLU is a general purpose library for the direct solution of
large, sparse, nonsymmetric systems of linear equations on high performance
machines.
The library routines will perform an LU decomposition with partial pivoting
and triangular system solves through forward and back substitution. The LU
factorization routines can handle non-square matrices but the triangular
solves are performed only for square matrices. The matrix columns may be
preordered (before factorization) either through library or user supplied
routines. This preordering for sparsity is completely separate from the
factorization. Working precision iterative refinement subroutines are
provided for improved backward stability. Routines are also provided to
equilibrate the system, estimate the condition number, calculate the
relative backward error, and estimate error bounds for the refined solutions.
The SuperLU package comes in three different flavors:
- SuperLU for sequential machines
- SuperLU_MT for shared memory parallel machines
- SuperLU_DIST for distributed memory
On Cheetah the SuperLU library is installed in
/usr/local/ACTS/SUPERLU, and
also in /apps/SUPERLU/2.0/rs4_aix51_32/lib.
Both the serial, SUPERLU_2.0, and the distributed version,
SUPERLU_DIST_1.0, were installed. They have the same directory
structures as the other ACTS libraries. Example codes, make files,
and LoadLeveler batch scripts can be found under examples/.
For more information see the
SuperLU web page and
also
ACTS: SuperLU.
TurboMPI is a small library of collective functions from the MPI API
(such as broadcast, reduce, allreduce, barrier, alltoall) that are optimized
for running MPI applications on shared memory nodes.
The Turbo library will work with any MPI application. To use it, simply
link your MPI application with the Turbo library (-lturbo):
mpxlf90_r your_mpi_code.f -lturbo
Version 2.0.0 contains
mpi_barrier mpi_allreduce mpi_reduce
mpi_bcast mpi_alltoall mpi_alltoallv
Several environment variables control TurboMPI behavior:
MPJ_SHM_SIZE size in MBytes of the shared segment on each node (default 160)
MPJ_MTAB max number of communicators (default 20)
MPJ_MPPN number of processors per node (default 32)
MPJ_MNODE number of nodes (default 64)
You can disable a routine by setting the corresponding variable below
to 0 (1 means active):
MPJ_BARRIER (default = 1)
MPJ_ALLREDUCE (default = 1)
MPJ_REDUCE (default = 1)
MPJ_BCAST (default = 1)
MPJ_ALLTOALL (default = 0) 1:development version, 2:turbompi tested version
MPJ_ALLTOALLV (default = 0)
MPJ_INFO (default = 0) set to 1 to get information about TurboMPI
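For illustration, the variables above could be set in a job script before the MPI program starts. The values shown are the documented defaults, except that mpi_bcast is disabled and MPJ_INFO is switched on as examples; which settings suit a given job is an assumption left to the user.

```shell
# Resource limits, set here to their documented defaults.
export MPJ_SHM_SIZE=160   # shared segment size on each node, in MBytes
export MPJ_MTAB=20        # maximum number of communicators
export MPJ_MPPN=32        # number of processors per node
export MPJ_MNODE=64       # number of nodes

# Disable the Turbo version of one routine (0 = disabled, 1 = active).
export MPJ_BCAST=0

# Request information about TurboMPI.
export MPJ_INFO=1
```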
The TurboMPI library is installed in /usr/apps/lib/libturbo.a. You
can find examples in the /apps/TurboMP/2.0.0/Examples directory.
For more information see the
IBM TurboMP web page.
TurboSHMEM is a fairly complete implementation of the SHMEM API
popularized by Cray Research for their T3D/E systems. The implementation here
uses IBM Low-Level API (LAPI) technology to obtain optimized one-sided
communication for the put/get operations. This allows applications already
written with the SHMEM API to run on IBM platforms with minimal source code
changes.
Due to the evolution of LAPI, not to mention IBM POWER hardware over the
last couple of years, some design decisions were made to simplify
the maintenance of this library. This library is tuned to the current
POWER4 hardware and its supporting software (PE 3.2). Although not
optimal for POWER3 systems, it runs reasonably well on those systems,
but is not "feature complete".
Limitations are as follows:
POWER4 (PE 3.2)
---------------
(1) 64-bit supported.
(2) Shared memory supported.
(3) Vector data movement supported.
(4) SP Switch supported.
TurboSHMEM is already enabled to use TurboMPI, which
optimizes the collective functions on shared memory nodes. There is no need to
add the -lturbo library. However, for environment variables associated with
the turbo library, please see the TurboMPI section.
To use TurboSHMEM, use a thread-safe compiler invocation (e.g., the _r versions).
For example, compile with: mpxlf_r -q64 myprogram.f -L/usr/apps/lib -lsmaf
Use libsmaf for Fortran and libsmac for C.
Then, in order to run the code, some environment variables must be set.
POWER4:
MP_MSG_API=mpi,lapi
MP_SHARED_MEMORY=yes
MP_EUILIB=us
MP_RESD=no
LAPI_USE_SHM=yes
*** NOTE: The LAPI cross-memory kernel extension must be loaded in order to use
the LAPI_USE_SHM environment variable. See the PSSP 3.5 admin manual for
details.
To execute the program use "poe a.out" just like with any other parallel program.
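Put together, the POWER4 settings and the launch step might look like the sketch below. The check for poe is an addition for illustration, so that the fragment does nothing harmful on a machine without the Parallel Environment; a.out stands for your compiled program.

```shell
# Environment required on POWER4 before running a TurboSHMEM program.
export MP_MSG_API=mpi,lapi
export MP_SHARED_MEMORY=yes
export MP_EUILIB=us
export MP_RESD=no
export LAPI_USE_SHM=yes

# Launch under POE if it is available on this machine.
if command -v poe >/dev/null 2>&1; then
  poe a.out
else
  echo "poe not found; environment prepared only" >&2
fi
```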
The TurboSHMEM libraries are installed in /usr/apps/lib. You
can find examples in the /apps/TurboMP/2.0.0/Examples directory.
There are several caveats when using TurboSHMEM; please consult the
README.TurboSHMEM file in /apps/TurboMP/2.0.0.
For more information see the
IBM TurboMP web page.
dbx --- the IBM debugger for serial programs.
Installed in /usr/bin.
See
IBM Parallel Environment for AIX: Debugging and Visualizing
for more information on debugging.
Ensight is a tool for all types of engineering analysis, visualisation
and communication.
On Cheetah, Ensight can be found in /usr/local/bin/ensight
and /usr/local/bin/ensight74.
For more information see the
Ensight web page.
The GNU Image Manipulation Program gimp is a freely distributed piece of
software for tasks like photo retouching and image composition and authoring.
It can be used as a simple paint program, an expert quality photo retouching
program, an online batch processing system, a mass production image renderer,
an image format converter, etc. It has most of the functionality of Photoshop
but without the cost. GIMP also comes with extensive documentation for both
users and programmers, including a manual, tutorials, examples of various
features, links to other GIMP-related sites, and an extensive list of plug-ins.
On Cheetah gimp can be found in /usr/local/bin/gimp.
For more information see The Gimp homepage.
gnuplot is a command-driven interactive function plotting program.
It can be used to plot functions and data points in both two- and
three-dimensional plots in many different formats.
On Cheetah gnuplot can be found in
/usr/local/bin/gnuplot.
For more information see the Gnuplot
web page.
grace
Grace is a WYSIWYG 2D plotting tool. Its capabilities are roughly similar to
GUI-based programs like Sigmaplot or Microcal Origin plus script-based tools
like Gnuplot or Genplot. Its strength lies in the fact that it combines the
convenience of a graphical user interface with the power of a scripting
language which enables it to do sophisticated calculations or perform automated
tasks. Grace is a descendant of ACE/gr, also known as Xmgr.
On Cheetah the Grace tools are installed in /apps/grace
but they can be reached from /usr/local/bin/, with the
commands convcal, fdf2fit, gracebat, grconvert, and xmgrace.
For more information see the
Grace web page.
The Grid Analysis and Display System (GrADS) is an interactive
desktop tool that is used for easy access, manipulation, and visualization
of earth science data. The format of the data may be either binary, GRIB,
NetCDF, or HDF-SDS (Scientific Data Sets). GrADS has been implemented
worldwide on a variety of commonly used operating systems and is freely
distributed over the Internet.
The GrADS executables available in /usr/local/bin are the following:
- gradsc
- gradsdods
- gradshdf
- gradsnc
- gribmap
- gribscan
- gxeps
- gxps
- gxtran
- stnmap
- wgrib
See the
GrADS online documentation
for more information. Note that a short tutorial is available in
/usr/local/apps/grads/1.8/example.
The Hardware Performance Monitor (HPM) toolkit is a suite of tools
for monitoring the hardware performance counters during execution of a
program. The HPM toolkit consists of three parts:
- hpmcount (in /usr/apps/bin/hpmcount)
- libhpm.a (in /usr/apps/include/libhpm.h)
- hpmviz (in /usr/apps/bin/hpmviz)
hpmcount is a simple-to-use standalone utility that starts an
application and provides summary utilization data for the entire run.
libhpm, by contrast, is an interface that can be used to obtain utilization
statistics for certain regions of code. The libhpm interface stores output
in two files: a plain text file that looks similar to the hpmcount
output, and a second file designed to be visualized by the hpmviz utility.
See HPM Overview and the
Hardware Performance Monitor (HPM) Toolkit
for more information. Also, there are examples in
/apps/HPM/prod/rs4_aix51/doc/examples.
NCAR Graphics: the libraries are installed in
/opt/public/lib (libncarg.a, libncarg_c.a, libncarg_gks.a,
libncarg_ras.a, and libnco.a). More of this software can be found in
/apps/ncarg+ncl/prod/rs_aix51_64/lib.
See NCAR Graphics Homepage
for additional information.
NetCDF Viewer (NCView) is a visual browser for netCDF format files.
Typically you would use ncview to get a quick and easy, push-button
look at your netCDF files. You can view simple movies of the data,
view along various dimensions, take a look at the actual data values,
change color maps, invert the data, etc.
The ncview executable is in /usr/local/bin. You can
also type man ncview for details.
For more information see the
NCView home page.
Open Visualization Data Explorer (OpenDX) is an application and
development software package for visualizing data, especially 3D data from
simulations or acquired from observations. It uses a Graphical User Interface
based on X windows and Motif. It comes with a complete set of standard
visualization tools for looking at data. These tools include cutting planes,
vector line traces, volume rendering, and isosurface/isocontour tools.
On Cheetah OpenDX can be found at /usr/local/bin/dx.
For more information see the OpenDX web page.
pdbx is a command-line parallel debugger built on dbx.
On Cheetah it is found at /usr/bin/pdbx.
See
IBM Parallel Environment for AIX: Debugging and Visualizing
or
"man pdbx" for more information on debugging.
TotalView is a source-level, X-windows GUI debugger for single-
and multi-process programs. It works with Fortran and Fortran 90, C,
and C++. It can be used for parallel programs that run under POE with
hand-coded MPI calls. It also has facilities for multi-process
thread-based parallel programs such as OpenMP.
The TotalView parallel debugger is located at
/apps/toolworks/totalview. Version 6.3.0 of the executable
is located at /apps/toolworks/totalview/bin/totalview.
The following are online TotalView resources:
xmgr is a 2-D plotting tool for numerical data. The last public
release was 4.1.2. The xmgrace tool
supersedes it.
On Cheetah, xmgr can be found at /usr/local/bin/xmgr.
xmgrace is a WYSIWYG 2-D plotting tool for numerical data.
On Cheetah it is found at /usr/local/bin/xmgrace.
For more information, see the
Grace home page.
The xprofiler tool is an extended graphical user interface profiler.
To use the xprofiler tool, the source code has to be compiled with
the -pg option.
Then use xprofiler after the execution of the compiled program.
xprofiler is installed in /usr/bin/xprofiler.
For more information, see "man xprofiler".
prof and
gprof --- the IBM serial and parallel execution
profilers, installed in
/usr/bin.
See "man prof" and "man gprof" for details.
All GNU libraries are installed in /usr/local/gnu/,
but have links in /usr/local/lib so that the actual location
does not need to be known.
DejaGnu is a framework for testing other programs. Its purpose
is to provide a single front end for all tests. Think of it as a
custom library of Tcl procedures crafted to support writing a
test harness. A test harness is the testing infrastructure that
is created to support a specific program or tool. Each program
can have multiple testsuites, all supported by a single test
harness. DejaGnu is written in Expect, which in turn uses Tcl --
Tool command language.
For more information see the
Dejagnu Web Page.
A group of utilities that display differences between and among
text files. 'diff' outputs the differences between two files,
producing no output if the files are identical. If the files are
binary (non-text), it reports only that they are different.
'cmp' shows the offsets and line numbers where two files differ;
it can also show all the characters that differ between the
two files. 'sdiff' merges two files interactively. 'diff3' shows
differences among three files.
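A minimal sketch of diff and cmp in use follows. The file names and contents are made up for illustration. Both tools return exit status 0 when the files are identical and 1 when they differ, which is what the fragment records.

```shell
# Work in a scratch directory with three small text files.
tmp=$(mktemp -d)
printf 'alpha\nbeta\n'  > "$tmp/a.txt"
printf 'alpha\nbeta\n'  > "$tmp/same.txt"
printf 'alpha\ngamma\n' > "$tmp/b.txt"

# Identical files: diff prints nothing and returns 0.
same_status=0
diff "$tmp/a.txt" "$tmp/same.txt" || same_status=$?

# Different files: diff prints the changed lines and returns 1.
diff_status=0
diff "$tmp/a.txt" "$tmp/b.txt" || diff_status=$?

# cmp reports where the first difference occurs and also returns 1.
cmp_status=0
cmp "$tmp/a.txt" "$tmp/b.txt" || cmp_status=$?

echo "identical: $same_status, different: $diff_status, cmp: $cmp_status"
rm -rf "$tmp"
```

Capturing the status with "|| status=$?" keeps the fragment usable in scripts that run with "set -e", since a nonzero status from diff is an expected result here rather than an error.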
For more information see the
Diffutils Web Page.
GSL, the GNU Scientific Library, is a numerical library
for C and C++ programmers. The library provides a wide range of
mathematical routines such as random number generators,
special functions and least-squares fitting. There are over
1000 functions in total.
On Cheetah the GSL libraries have been installed in
/apps/GSL/prod/rs4_aix51_32/lib and are called
libgsl.a, libgsl.la, libgslcblas.a, and libgslcblas.la.
The GSL executables are found in
/apps/GSL/prod/rs4_aix51_32/bin and are called
gsl-config, gsl-histogram, and gsl-randist.
Finally, the GSL include files can be found in
/apps/GSL/prod/rs4_aix51_32/include/gsl, or also in
/usr/local/include/gsl.
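Based on the locations above, compile and link flags for a C program using GSL might be assembled as in this sketch. The compiler name cc and the source file my_prog.c are placeholders, and the addition of -lm (the C math library) is an assumption that is usual for GSL programs. The fragment prints the command line rather than running it.

```shell
gsl_root=/apps/GSL/prod/rs4_aix51_32

# Headers live in $gsl_root/include/gsl, so code can use
# includes of the form <gsl/...> with this -I path.
gsl_cflags="-I$gsl_root/include"

# libgsl.a needs a CBLAS implementation; libgslcblas.a provides one.
gsl_libs="-L$gsl_root/lib -lgsl -lgslcblas -lm"

# Print the command line; cc and my_prog.c are placeholders.
echo "cc $gsl_cflags my_prog.c $gsl_libs"
```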
For more information see the
GSL homepage.
jpeg
(pronounced "jay-peg") is a standardized image compression mechanism.
JPEG stands for Joint Photographic Experts Group, the original name of the
committee that wrote the standard.
JPEG is designed for compressing either full-color or gray-scale images
of natural, real-world scenes. It works well on photographs, naturalistic
artwork, and similar material; not so well on lettering, simple cartoons,
or line drawings. JPEG handles only still images.
The libjpeg.a library is available in /usr/local/lib.
For more information see the
Independent JPEG Group Web Page.
The Portable Application Code Toolkit (PACT) is a toolkit that can be used
to create portable applications. PACT is LLNL's environment of choice for
handling unique data, portable computing environments, and for converting
data for visualization.
The PACT toolkit is only available in /apps/pact.
For more information see the
PACT Web Page.
The Portable Network Graphics (PNG) format provides a portable,
well-compressed, well-specified standard for lossless bitmapped image files.
Although the initial motivation for developing PNG was to replace GIF
(Graphics Interchange Format), the design provides some useful new features
not available in GIF, with minimal cost to developers. It has two major
uses: the World Wide Web (WWW) and image-editing.
For more information see the
PNG Web Page.
sprng provides both FORTRAN and C (also C++) interfaces for the use of
the parallel random number generators. The current version has all the
generators in one library. Complete documentation is not available
yet. Please consult the older documentation and the interface at
SPRNG home.
The GNU Text Utilities are the basic text-manipulation utilities of
the GNU operating system.
For more information see the
Textutils Web Page.
ungif is a library for reading and writing gif images.
The save functionality uses an uncompressed gif algorithm to avoid
the Unisys LZW patent. This library is based heavily on Eric Raymond's
libgif package and implements a superset of that library's API.
For more information see the
libungif Web Page.
zlib is a free, general-purpose, lossless data-compression library
for use on virtually any computer hardware and operating system. The zlib
data format is itself portable across platforms. Unlike the LZW compression
method used in Unix compress(1) and in the GIF image format, the compression
method currently used in zlib essentially never expands the data. (LZW can
double or triple the file size in extreme cases.) zlib's memory footprint
is also independent of the input data and can be reduced, if necessary, at
some cost in compression.
The libz.a library is available in /usr/local/lib.
For more information see the
zlib Web Page.
NWChem is a high-performance computational chemistry software package. It
is designed to run on high-performance parallel supercomputers as well as
conventional workstation clusters. Due to the slow inter-node communication
that will exist until the Federation switch is installed, it is suggested
that NWChem be run within a 32-way node using the MPI version, which is
currently the default version. For multiple node jobs, the LAPI version
is also available, but the path must be specified. The LAPI version is
installed in /apps/NWChem/4.5/rs4_aix51_64_LAPI/bin/, and its data
files reside in /apps/NWChem/4.5/rs4_aix51_64_LAPI/data.
Each user of NWChem will need a .nwchemrc file in their home
directory that either is a copy of
/apps/NWChem/prod/data/default.nwchemrc
OR that points to this default.nwchemrc file. In order to point to it,
users would have to issue the following command prior to using NWChem:
"ln -s /apps/NWChem/prod/data/default.nwchemrc $HOME/.nwchemrc".
For more information, consult the following resources:
Hierarchical Storage Interface (hsi) is a friendly interface for the
users of HPSS. For more information on hsi, type "hsi help" on
one of the login nodes or click on the link below.
Other HSI resources
LoadLeveler is a batch scheduling system available
from IBM for the SP. It provides facilities for building,
submitting, and processing serial or parallel
(MPI and LAPI) batch jobs within a network of machines.
Other LoadLeveler Resources