
Assorted information


This page lists assorted information that might be of interest to ESPResSo++ developers.

Hardware

This section contains information on specific hardware and its limitations/benefits.

POWER6

The IBM POWER6 processor is a traditional general-purpose CPU. However, it executes instructions in order, so the compiler is responsible for preventing pipeline stalls. In practice this means mainly one thing: if-statements are catastrophic for performance and need to be avoided in all critical loops.

For MD, this commonly occurs in the force/energy loop, where one needs to check for the cutoff radius:

foreach(Pair pair, verletlist) {
  float_t d = dist(pair[1].pos, pair[2].pos);
  if (d < cutoff) {
    energy += eps*pow(d,-4);
  }
}

The construct above is about 2.5x slower than a split-up version (below):

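// heap storage; an array whose size is only known at run time is not standard C++
std::vector<float_t> d(verletlist.size());
size_t pairsize = 0;
// first loop: keep only the distances inside the cutoff
foreach(Pair pair, verletlist) {
  float_t d_tmp = dist(pair[1].pos, pair[2].pos);
  if (d_tmp < cutoff) {
    d[pairsize++] = d_tmp;
  }
}
// second loop: energy evaluation without any cutoff test
for(size_t i = 0; i < pairsize; ++i) {
  energy += eps*pow(d[i],-4);
}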

In the split-up version the compiler is able to eliminate the branch in the first loop, and therefore both loops are branch-free. To some extent this problem also concerns recent Intel/AMD processors: they suffer large penalties for mispredicted branches, and a data-dependent test such as the cutoff check is hard to predict.
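
For readers who want to time the two variants on their own hardware, here is a self-contained sketch of both. The `Vec3` and `Pair` types, the `dist` helper and the function names are assumptions made for illustration and are not ESPResSo++ API. The split variant stores every distance unconditionally and only advances the index conditionally, which is one way to write the compaction so that a compiler can easily avoid a branch. The 2.5x figure quoted above was measured on POWER6 and is not something this code demonstrates by itself.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal types, not ESPResSo++ API.
struct Vec3 { double x, y, z; };
struct Pair { Vec3 a, b; };

inline double dist(const Vec3& p, const Vec3& q) {
  const double dx = p.x - q.x, dy = p.y - q.y, dz = p.z - q.z;
  return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Variant 1: cutoff test and energy evaluation in the same loop.
double energy_branchy(const std::vector<Pair>& verletlist,
                      double cutoff, double eps) {
  double energy = 0.0;
  for (const Pair& pair : verletlist) {
    const double d = dist(pair.a, pair.b);
    if (d < cutoff) {
      energy += eps * std::pow(d, -4);
    }
  }
  return energy;
}

// Variant 2: first compact the distances below the cutoff, then evaluate.
double energy_split(const std::vector<Pair>& verletlist,
                    double cutoff, double eps) {
  std::vector<double> d(verletlist.size());
  std::size_t pairsize = 0;
  for (const Pair& pair : verletlist) {
    const double d_tmp = dist(pair.a, pair.b);
    // Always store; advance the index only if the pair is inside the cutoff.
    d[pairsize] = d_tmp;
    pairsize += (d_tmp < cutoff) ? 1 : 0;
  }
  double energy = 0.0;
  for (std::size_t i = 0; i < pairsize; ++i) {
    energy += eps * std::pow(d[i], -4);
  }
  return energy;
}
```

Both functions return the same energy for the same input, so they can be swapped in a benchmark loop without changing the result.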

A second "feature" of the POWER6 is that it lacks store forwarding. That is, a value that has been written to the cache cannot be read again until the write has actually completed, which causes a pipeline stall. Since there is no direct path from the FPU to the integer unit, all floating-point to integer conversions go via the cache and cause such a stall. Therefore, one should avoid floating-point to integer conversions as much as possible.
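
One place where such conversions typically show up in MD codes is the mapping of a position to a cell index. The following is a minimal sketch of doing the conversion once per particle and caching the integer result, so that later loops only read integers. The names `cell_index` and `assign_cells` are hypothetical, and a one-dimensional box starting at 0 is assumed for brevity.

```cpp
#include <cstddef>
#include <vector>

// The floating-point to integer conversion itself.
// Assumes x >= 0, so that truncation equals rounding down.
inline int cell_index(double x, double inv_cell_size) {
  return static_cast<int>(x * inv_cell_size);
}

// Perform all conversions in a single pass and cache the integer results.
std::vector<int> assign_cells(const std::vector<double>& xs, double cell_size) {
  const double inv = 1.0 / cell_size;
  std::vector<int> cells(xs.size());
  for (std::size_t i = 0; i < xs.size(); ++i) {
    cells[i] = cell_index(xs[i], inv);
  }
  return cells;
}
```

Code that needs the cell of particle `i` afterwards reads `cells[i]` instead of converting the coordinate again.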

BlueGene

The BlueGene is basically a homogeneous parallel computer with thousands of relatively weak processors. Each processor has 4 cores, which deliver at most 5 GFlops per core, compared to about 10-20 GFlops that a current Intel/AMD CPU delivers. The internal network topology is a simple 3D torus, i.e. data is transferred in hops from node to neighboring node; the BlueGene design is therefore best suited for applications with mostly local communication (such as MD).
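
To illustrate why locality matters on a torus, the following sketch counts the minimal number of hops between two nodes, assuming one hop per step along a single axis and wrap-around links in every dimension. The function names and the torus dimensions in the usage note are made up for illustration and do not describe a specific BlueGene installation.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdlib>

// Minimal hop count along one periodic axis of length n.
inline int ring_hops(int a, int b, int n) {
  const int direct = std::abs(a - b);
  return std::min(direct, n - direct);
}

// Minimal hop count between two nodes of a 3D torus with the given dimensions.
inline int torus_hops(const std::array<int, 3>& from,
                      const std::array<int, 3>& to,
                      const std::array<int, 3>& dims) {
  int hops = 0;
  for (std::size_t k = 0; k < 3; ++k) {
    hops += ring_hops(from[k], to[k], dims[k]);
  }
  return hops;
}
```

On a hypothetical 8x8x8 torus, node (0,0,0) reaches (1,0,0) in one hop and (7,0,0) also in one hop through the wrap-around link, while (4,4,4) is 12 hops away, the largest distance on that torus.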

The problems with Tcl stem from the fact that on BlueGene, I/O is provided indirectly through a few I/O nodes that are not accessible to the user. This means that every I/O operation, including printing to stdout, causes network traffic. Moreover, not all Linux I/O is forwarded; some system calls are not implemented. Therefore, it is in general not possible to use libraries that perform I/O unless they are modified.

Other useful information:

  • python2.5 is installed by default on all compute nodes and works efficiently. Apparently, quite a few applications use Python for setup.
  • Tcl is now also possible :-).
  • the remaining problems with the old Espresso are indeed mostly due to the fact that each particle is sent separately. Caching would solve this problem, which also seems to be the only bottleneck left in the old Espresso.
  • the BlueGene guy has never heard of Boost
  • there are already some automake macros to set up cross-compiling for BlueGene
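
The caching mentioned in the list above is interpreted here as collecting particles in a buffer and sending the buffer in one go instead of sending one message per particle. Below is a minimal sketch of the packing step only, without the actual MPI calls. `Particle`, `ParticleBuffer`, `pack` and `unpack` are hypothetical and not taken from the old Espresso code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical particle record.
struct Particle {
  int id;
  double pos[3];
};

// Contiguous buffers that could each be sent with a single message.
struct ParticleBuffer {
  std::vector<int> ids;     // one entry per particle
  std::vector<double> pos;  // three entries per particle
};

// Collect all particles into the two contiguous arrays.
ParticleBuffer pack(const std::vector<Particle>& ps) {
  ParticleBuffer buf;
  buf.ids.reserve(ps.size());
  buf.pos.reserve(3 * ps.size());
  for (const Particle& p : ps) {
    buf.ids.push_back(p.id);
    buf.pos.insert(buf.pos.end(), p.pos, p.pos + 3);
  }
  return buf;
}

// Inverse of pack() on the receiving side.
std::vector<Particle> unpack(const ParticleBuffer& buf) {
  std::vector<Particle> ps(buf.ids.size());
  for (std::size_t i = 0; i < ps.size(); ++i) {
    ps[i].id = buf.ids[i];
    for (std::size_t k = 0; k < 3; ++k) {
      ps[i].pos[k] = buf.pos[3 * i + k];
    }
  }
  return ps;
}
```

With this layout the number of messages no longer depends on the number of particles: two sends (ids and positions) replace one send per particle.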

Design patterns

Scripted applications

How do other applications couple to scripts, in particular Python? Are there any examples of parallel applications driven by a scripting language?

Links

  • Design patterns for scripted applications: [1]
    OL: Some useful patterns here. Reflects the ideas we already had.

Parallel Programming/MPI and OOP

How do parallel and object-oriented programming work together? Are there any useful libraries?

Links

MPI 2

A few interesting facts from the MPI 2 standard[3] that may make MPI usage in ESPResSo++ easier than in ESPResSo:

  1. Chapter 4.1 "Portable MPI Process Startup": MPI 2 recommends that MPI implementors provide an mpiexec program with standardized command line parameters. This allows us to write a single ESPResSo++ startup script that should work with most MPI environments, so that we do not have to look into the various implementations.
  2. Chapter 4.2 "Passing NULL to MPI_Init": MPI_Init no longer requires the arguments argc and argv from the command line; they can be set to NULL instead. Any MPI 2 implementation must allow this. Note, however, that the important MPI implementation MPICH does not conform to MPI 2 in that respect.
