Coarray Fortran (CAF) is an SPMD parallel programming model
based on a small set of language extensions to Fortran 90. CAF
supports access to non-local data using a natural extension to Fortran
90 syntax, lightweight and flexible synchronization primitives,
pointers, and dynamic allocation of shared data. An
executing CAF program consists of a static collection of asynchronous
process images. Like MPI programs, CAF programs explicitly manage
locality, data and computation distribution; however, CAF is a
shared-memory programming model based on one-sided
communication. Rather than explicitly coding message exchanges to
obtain off-processor data, CAF programs can directly reference
off-processor values using an extension of Fortran 90 syntax for
subscripted references. Since both remote data access and
synchronization are expressed in the language, communication and
synchronization are amenable to compiler-based optimizing
transformations.
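
To make the one-sided model concrete, the following is a toy, single-process Python model. It is not CAF syntax and not the Rice runtime: the class `ToyCoarray`, its `(index, image)` subscript convention, and the helper functions are hypothetical names invented for this sketch. They mimic how a CAF reference of the form `x(i)[p]` names element `i` on image `p` without any matching receive on the owning image.

```python
class ToyCoarray:
    """Toy model of a coarray: one equally sized local array per image.

    Hypothetical illustration only. Images are numbered from 1, as in CAF;
    local indices are 0-based, as usual in Python.
    """

    def __init__(self, num_images, local_size, fill=0.0):
        self.num_images = num_images
        self._data = [[fill] * local_size for _ in range(num_images)]

    def _local(self, image):
        if not 1 <= image <= self.num_images:
            raise IndexError(f"image {image} out of range")
        return self._data[image - 1]

    def __getitem__(self, key):
        index, image = key  # one-sided "get": read element `index` on `image`
        return self._local(image)[index]

    def __setitem__(self, key, value):
        index, image = key  # one-sided "put": write element `index` on `image`
        self._local(image)[index] = value


def left_neighbor(me, num_images):
    """Image to the left of `me` on a periodic ring of images 1..num_images."""
    return num_images if me == 1 else me - 1


def exchange_demo(num_images=4, local_size=3):
    x = ToyCoarray(num_images, local_size)
    # Each image fills its own local part (this loop stands in for SPMD execution).
    for me in range(1, num_images + 1):
        for i in range(local_size):
            x[i, me] = 10 * me + i
    # A real program would need a barrier here, before any remote read.
    # Each image then reads the last element owned by its left neighbor.
    return {me: x[local_size - 1, left_neighbor(me, num_images)]
            for me in range(1, num_images + 1)}
```

The point of the sketch is that the reader names the remote element directly; the owner of the data takes no action to deliver it.
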
To date, CAF has not appealed to application scientists as a model
for developing scalable, portable codes, because the language is still
somewhat immature and a compiler is available only on Cray platforms.
Recently, there has been a groundswell of interest in CAF as the Fortran 2008
standards committee has been working to extend Fortran with
coarrays to support parallel programming.
While we are pleased that the Fortran standards committee recognizes the
value of coarrays, we do not believe that the extensions agreed upon by
the committee are the right ones. The standards committee's
design choices were
shaped more by the desire to introduce as few modifications to the
language as possible than by the goal of assembling the best set of
extensions to support parallel programming.
In our view, both Numrich and Reid's original design
and the coarray extensions proposed for Fortran 2008 suffer
from the following shortcomings:
- There is no support for processor subsets; for instance,
coarrays must be allocated over all images.
- Coarrays must be declared as global variables; one cannot
dynamically allocate a coarray into a locally scoped variable.
- The coarray extensions lack any notion of global pointers, which
are essential for creating and manipulating any kind of linked data
structure.
- Reliance on named critical sections for mutual exclusion hinders
scalable parallelism by associating mutual exclusion with code
regions rather than data objects.
- Fortran 2008's sync images statement does not provide a safe
synchronization space. As a result, synchronization operations in
user code that are pending when a library call is made can
interfere with synchronization inside the library call.
- There are no mechanisms to avoid or tolerate latency when
manipulating data on remote images.
- There is no support for collective communication.
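
The critical-section item in the list above can be illustrated with a small Python threading sketch. It is an analogy in a shared-memory thread setting, not CAF code; `CRITICAL`, `Bucket`, `run`, and the two update functions are hypothetical names. In the first style a single lock stands for a named code region, so updates to unrelated objects serialize. In the second, each data object carries its own lock, so only updates to the same object contend.

```python
import threading

CRITICAL = threading.Lock()  # models one named critical section (a code region)


class Bucket:
    """A shared data object that carries its own lock."""

    def __init__(self):
        self.count = 0
        self.lock = threading.Lock()


def add_in_critical_region(bucket, n):
    # Every caller takes the same lock, whichever bucket it updates.
    for _ in range(n):
        with CRITICAL:
            bucket.count += 1


def add_with_data_lock(bucket, n):
    # Callers exclude each other only when they update the same bucket.
    for _ in range(n):
        with bucket.lock:
            bucket.count += 1


def run(update, num_buckets=4, workers_per_bucket=2, n=1000):
    """Apply `update` from several threads per bucket; return final counts."""
    buckets = [Bucket() for _ in range(num_buckets)]
    threads = [threading.Thread(target=update, args=(b, n))
               for b in buckets for _ in range(workers_per_bucket)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [b.count for b in buckets]
```

Both styles produce the same totals. They differ in which updates must wait for one another: with the code-region lock, all eight workers queue behind one lock, while with per-object locks only the two workers that share a bucket do.
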
To address these shortcomings, Rice University
is developing a clean-slate redesign of the Coarray
Fortran programming model.
Rice's new design for Coarray Fortran, which we call Coarray Fortran 2.0,
is an expressive set of coarray-based extensions to Fortran designed
to provide a productive parallel programming model. Compared to the
emerging Fortran 2008 standard, Rice's new coarray-based language extensions
include the following additional features:
- process subsets known as teams, which support coarrays,
collective communication, and relative indexing of process images
for pair-wise operations,
- topologies, which augment teams with a logical communication
structure,
- dynamic allocation/deallocation of coarrays and other shared
data,
- coarrays as local variables within subroutines: declaring and allocating
coarrays within the scope of a procedure is critical for library-based code,
- team-based coarray allocation and deallocation,
- global pointers in support of dynamic data structures,
- enhanced support for synchronization for fine control over
program execution,
- safe and scalable support for mutual exclusion, including locks
and lock sets, and
- events, which provide a safe space for point-to-point
synchronization.
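
As an illustration of the team idea in the list above, here is a minimal Python sketch of splitting a set of images into teams and giving each member a rank relative to its team. The functions `split_into_teams`, `team_rank`, and `team_neighbor`, and the `color`/`key` arguments, are hypothetical; they are modeled on the color/key style of communicator splitting familiar from MPI. The actual CAF 2.0 interface is not shown here and may differ.

```python
def split_into_teams(images, color, key=lambda image: image):
    """Partition `images` into teams.

    Images with equal `color(image)` join the same team; within a team,
    members are ordered by `key(image)`.
    Returns {color: [image, ...]} with each list in team-rank order.
    """
    teams = {}
    for image in images:
        teams.setdefault(color(image), []).append(image)
    return {c: sorted(members, key=key) for c, members in teams.items()}


def team_rank(team, image):
    """1-based rank of `image` relative to its team."""
    return team.index(image) + 1


def team_neighbor(team, image, offset):
    """Team member `offset` positions away from `image`, wrapping around."""
    return team[(team.index(image) + offset) % len(team)]


# Eight images arranged as a 2 x 4 grid; each grid row becomes a team.
ROW_TEAMS = split_into_teams(range(1, 9), color=lambda image: (image - 1) // 4)
```

With relative indexing, a pair-wise operation such as "send to my right neighbor in this row" can be written once in terms of team ranks, independent of where the team's members sit in the global image numbering.
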
Rice's implementation of Coarray Fortran 2.0 is a work in progress.
We are working to create an open-source, portable, retargetable,
high-quality CAF 2.0 compiler suitable for use with production codes.
To achieve portability, our compiler performs a
source-to-source translation from CAF to Fortran 90 with calls to our
CAF 2.0 runtime library primitives. Our CAF compiler's generated code
can be compiled
by any Fortran 90 compiler that supports Cray pointers. To achieve
high performance, we generate Fortran 90 that is readily optimizable
by vendor compilers.
Our CAF 2.0 runtime library uses UC Berkeley's GASNet library as a substrate
for communication. GASNet's get and put operations are used to read
and write remote coarray elements. GASNet's active message support is
used to invoke operations on remote nodes. This capability is used to
form teams and to look up information about remote coarrays so that
process images can read and write them directly.
By popular demand,
a pre-alpha prototype CAF 2.0 compiler is available for download. This
prototype supports most of the new CAF 2.0 language features.
See our
release notes
for details about the current status of our prototype.
At present, our CAF 2.0 implementation is operational
on Linux clusters and Cray XT systems. Support for Blue Gene systems is
planned.
We have recently begun experimentation with benchmarks to assess
the performance and scalability of our implementation.
Our performance page shows
early results.
Development of Coarray Fortran 2.0 is supported by the Department
of Energy's Office of Science under cooperative agreements
DE-FC02-06ER25754 and DE-FC02-07ER25800. National Science Foundation grant
CNS 08-21727 provided partial support for a computational cluster used
in this research.
Project Team: