Tutorial: Linux MPI Parallel Clusters Programming

MPI (the Message Passing Interface) is a language-independent communications protocol used to program parallel computers. It allows processes running on many computers to communicate with one another and is widely used on computer clusters.

OpenMP (Open Multi-Processing) is an application programming interface (API) that supports multi-platform, shared-memory multiprocessing programming in C/C++ and Fortran on many architectures, including Unix/Linux and Microsoft Windows platforms. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.
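
To give a feel for those compiler directives, here is a minimal sketch (my own illustration, not part of the original tutorial; the array size and variable names are arbitrary) that parallelizes a single loop with one #pragma:

#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N];

    /* The directive below splits the loop iterations across the
     * threads in the team created by the OpenMP runtime. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * i;

    printf("a[N-1] = %f (computed with up to %d threads)\n",
           a[N - 1], omp_get_max_threads());
    return 0;
}

Compiled with an OpenMP-aware compiler (for example gcc -fopenmp), the loop iterations are divided among threads, and the OMP_NUM_THREADS environment variable controls how many threads are used at run time.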

This tutorial explains how to obtain, build, and use an MPI stack on Linux machines. It will take you from "hello world" to parallel matrix multiplication in a matter of minutes. The exercise takes slightly more than 30 minutes and lets you develop and run MPI code on a multi-core server or on an HPC cluster.

OpenMP has several strong points: it is a very simple system to use, and it is widely available in compilers for most major platforms. There are, however, other ways to express parallelism in your code. On distributed parallel systems, such as Linux clusters, the Message Passing Interface (MPI) is widely used. MPI is not a programming language but a standard library used to send messages between multiple processes. These processes can be located on the same system (a single multi-core SMP system) or on a collection of distributed servers. Unlike OpenMP, the distributed nature of MPI allows it to work in almost any parallel environment; the sketch below shows the basic model.
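
To make that message-passing model concrete, here is a minimal "hello world" sketch (my own illustration, not the tutorial's code; the message text is arbitrary) in which every rank prints its identity and rank 1 sends a short message to rank 0:

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id     */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks */

    printf("Hello world from rank %d of %d\n", rank, size);

    /* A simple point-to-point message between two ranks. */
    if (size > 1) {
        char buf[64];
        if (rank == 1) {
            strcpy(buf, "greetings from rank 1");
            MPI_Send(buf, strlen(buf) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        } else if (rank == 0) {
            MPI_Recv(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("Rank 0 received: %s\n", buf);
        }
    }

    MPI_Finalize();
    return 0;
}

With a typical MPI stack this is built with mpicc hello.c -o hello and started with something like mpirun -np 4 ./hello; the exact compiler wrapper and launcher names depend on the MPI implementation installed.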

Does Linux support IP over InfiniBand (IPoIB) multipathing and failover?

Parallel supercomputers and computer clusters need failover and InfiniBand multipathing to provide non-stop computing. InfiniBand is a switched-fabric communications link primarily used in high-performance computing. Its features include quality of service and failover, and it is designed to be scalable. The InfiniBand architecture specification defines a connection between processor nodes and high-performance I/O nodes such as storage devices. It is a superset of the Virtual Interface Architecture.

InfiniBand theoretical throughput (speed)

InfiniBand provides high-speed data transfer. For example, USB 2.0 provides 480 Mbit/s and Gigabit Ethernet supports 1,000 Mbit/s, while a 12X InfiniBand link provides 96 Gbit/s in the quad data rate configuration. InfiniBand also supports both copper and optical cabling. The following table shows the effective theoretical throughput in different configurations:

       Single (SDR)   Double (DDR)   Quad (QDR)
 1X    2 Gbit/s       4 Gbit/s       8 Gbit/s
 4X    8 Gbit/s       16 Gbit/s      32 Gbit/s
12X    24 Gbit/s      48 Gbit/s      96 Gbit/s
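
As a quick sanity check on those figures (assuming the 8b/10b encoding used by SDR, DDR, and QDR links, which carries 8 data bits in every 10 signalled bits), the effective throughput is the number of lanes times the per-lane signalling rate of 2.5, 5, or 10 Gbit/s times 8/10. For a 12X QDR link, for example:

\[
\text{effective throughput} = \text{lanes} \times \text{signalling rate} \times \tfrac{8}{10},
\qquad
12 \times 10~\text{Gbit/s} \times \tfrac{8}{10} = 96~\text{Gbit/s}.
\]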

The InfiniBand Project

The InfiniBand Architecture (IBA) is an industry standard that defines a new high-speed switched fabric subsystem designed to connect processor nodes and I/O nodes to form a system area network. This new interconnect method moves away from the local transaction-based I/O model across busses to a remote message-passing model across channels. The architecture is independent of the host operating system (OS) and the processor platform.

IBA provides both reliable and unreliable transport mechanisms in which messages are enqueued for delivery between end systems. Hardware transport protocols are defined that support reliable and unreliable messaging (send/receive), and memory manipulation semantics (e.g., RDMA read/write) without software intervention in the data transfer path.

Linux and InfiniBand support

Most enterprise Linux distributions (such as RHEL 4.5 / 5, CentOS, and Novell Linux) have support for InfiniBand (including IPoIB), multipathing, and failover. Linux kernel v2.6.11 and above includes support for IPoIB and related technologies. The OpenFabrics Alliance is creating an open source software stack for InfiniBand and iWARP that includes the "IBVerbs" library.
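
If you want to verify that this stack can see your host channel adapters, here is a minimal sketch against the libibverbs API from the OpenFabrics stack (the file name ibv_list.c is my own placeholder); it simply lists the InfiniBand devices the library can find:

#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices;

    /* Ask libibverbs for the list of available IB devices. */
    struct ibv_device **dev_list = ibv_get_device_list(&num_devices);
    if (!dev_list) {
        perror("ibv_get_device_list");
        return 1;
    }

    for (int i = 0; i < num_devices; i++)
        printf("HCA %d: %s\n", i, ibv_get_device_name(dev_list[i]));

    ibv_free_device_list(dev_list);
    return 0;
}

On a system with the libibverbs development package installed, it should build with something like gcc ibv_list.c -o ibv_list -libverbs.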

I may get a chance to play with InfiniBand-based devices and Linux in the near future :D