Coarray Fortran 2.0 at Rice University

Coarray Fortran (CAF) is an SPMD parallel programming model based on a small set of language extensions to Fortran 90. CAF supports access to non-local data using a natural extension to Fortran 90 syntax, lightweight and flexible synchronization primitives, pointers, and dynamic allocation of shared data. An executing CAF program consists of a static collection of asynchronous process images. Like MPI programs, CAF programs explicitly manage locality and the distribution of data and computation; however, CAF is a shared-memory programming model based on one-sided communication. Rather than explicitly coding message exchanges to obtain off-processor data, CAF programs can directly reference off-processor values using an extension of Fortran 90 syntax for subscripted references. Since both remote data access and synchronization are expressed in the language, communication and synchronization are amenable to compiler-based optimizing transformations.
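
As a rough illustration of the subscripted remote reference described above, the following minimal sketch uses standard Fortran 2008 coarray syntax, not CAF 2.0 syntax; the program and variable names are assumptions chosen for illustration. Each image stores its own index in a scalar coarray and then reads its right-hand neighbor's copy through a square-bracket reference, with no matching send on the neighbor's side.

```fortran
program caf_remote_read
  implicit none
  integer :: x[*]                 ! one integer per image, remotely addressable
  integer :: me, n, right, got

  me = this_image()
  n  = num_images()
  x  = me                         ! each image writes only its local copy

  sync all                        ! order the local writes before any remote read

  right = merge(1, me + 1, me == n)   ! right-hand neighbor, wrapping around
  got   = x[right]                    ! one-sided read of the neighbor's x

  ! Self-check: the neighbor stored its own image index.
  if (got /= right) error stop "unexpected remote value"
  print '(a,i0,a,i0,a,i0)', 'image ', me, ' read ', got, ' from image ', right
end program caf_remote_read
```

Building this requires a compiler with coarray support. With GNU Fortran, for example, `-fcoarray=single` should build a single-image version, in which case the neighbor is the image itself.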

To date, CAF has not appealed to application scientists as a model for developing scalable, portable codes, because the language is still somewhat immature and a compiler is available only on Cray platforms. Recently, there has been a groundswell of interest in CAF as the Fortran 2008 standards committee has been working to extend Fortran with coarrays to support parallel programming. While we are pleased that the Fortran standards committee recognizes the value of coarrays, we don't believe that the set of extensions agreed upon by the committee is the right one. The standards committee's design choices were shaped more by the desire to introduce as few modifications to the language as possible than by the goal of assembling the best set of extensions to support parallel programming. In our view, both Numrich and Reid's original design and the coarray extensions proposed for Fortran 2008 suffer from the following shortcomings:

There is no support for processor subsets; for instance, coarrays must be allocated over all images.
Coarrays must be declared as global variables; one cannot dynamically allocate a coarray into a locally scoped variable.
The coarray extensions lack any notion of global pointers, which are essential for creating and manipulating any kind of linked data structure.
Reliance on named critical sections for mutual exclusion hinders scalable parallelism by associating mutual exclusion with code regions rather than data objects.
Fortran 2008's sync images statement doesn't provide a safe synchronization space. As a result, synchronization operations in user code that are pending when a library call is made can interfere with synchronization inside the library call.
There are no mechanisms to avoid or tolerate latency when manipulating data on remote images.
There is no support for collective communication.
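
The concern about critical sections can be sketched with standard Fortran 2008 syntax. In this hypothetical example (names such as `bins` and `nbins` are assumptions for illustration), each image increments one of several counters held on image 1. Because a critical construct guards a region of code, not a data object, updates to different counters are serialized even though they do not conflict.

```fortran
program caf_critical_bins
  implicit none
  integer, parameter :: nbins = 4
  integer :: bins(nbins)[*]       ! counters; only image 1's copy is updated
  integer :: me, b

  me = this_image()
  bins = 0
  sync all                        ! every copy is zeroed before any update

  b = mod(me - 1, nbins) + 1      ! the counter this image increments

  ! Only one image at a time may execute this block, regardless of
  ! which counter it touches: exclusion is tied to the code region.
  critical
    bins(b)[1] = bins(b)[1] + 1
  end critical

  sync all                        ! all updates are complete past this point

  if (me == 1) then
    ! Self-check: exactly one increment per image must have landed.
    if (sum(bins) /= num_images()) error stop "lost update"
    print '(a,4(1x,i0))', 'bin counts:', bins
  end if
end program caf_critical_bins
```

A mechanism that associates mutual exclusion with data would instead let images updating different counters proceed concurrently.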

To address these shortcomings, Rice University is developing a clean-slate redesign of the Coarray Fortran programming model. Rice's new design for Coarray Fortran, which we call Coarray Fortran 2.0, is an expressive set of coarray-based extensions to Fortran designed to provide a productive parallel programming model. Compared to the emerging Fortran 2008, Rice's new coarray-based language extensions include some additional features:

process subsets known as teams, which support coarrays, collective communication, and relative indexing of process images for pair-wise operations,
topologies, which augment teams with a logical communication structure,
dynamic allocation/deallocation of coarrays and other shared data,
- local variables within subroutines: declaration and allocation of coarrays within the scope of a procedure is critical for library-based code,
team-based coarray allocation and deallocation,
global pointers in support of dynamic data structures, and
enhanced support for synchronization for fine control over program execution,
- safe and scalable support for mutual exclusion, including locks and lock sets; and
- events, which provide a safe space for point-to-point synchronization.

Rice's implementation of Coarray Fortran 2.0 is a work in progress. We are working to create an open-source, portable, retargetable, high-quality CAF 2.0 compiler suitable for use with production codes. To achieve portability, our compiler performs a source-to-source translation from CAF to Fortran 90 with calls to our CAF 2.0 runtime library primitives. Our CAF compiler's generated code can be compiled by any Fortran 90 compiler that supports Cray pointers. To achieve high performance, we generate Fortran 90 that is readily optimizable by vendor compilers. Our CAF 2.0 runtime library uses UC Berkeley's GASNet library as a substrate for communication. GASNet's get and put operations are used to read and write remote coarray elements. GASNet's active message support is used to invoke operations on remote nodes. This capability is used to form teams and to look up information about remote coarrays so that process images can read and write them directly.

By popular demand, a pre-alpha prototype CAF 2.0 compiler is available for download. This prototype supports most of the new CAF 2.0 language features. See our release notes for details about the current status of our prototype. At present, our CAF 2.0 implementation is operational on Linux clusters and Cray XT systems. Support for Blue Gene systems is planned. We have recently begun experimenting with benchmarks to assess the performance and scalability of our implementation. Our performance page shows early results.

Development of Coarray Fortran 2.0 is supported by the Department of Energy's Office of Science under cooperative agreements DE-FC02-06ER25754 and DE-FC02-07ER25800. National Science Foundation grant CNS 08-21727 provided partial support for a computational cluster used in this research.

Project Team: