Friday, October 25, 2013

Passing Assumed Shape Arrays between Fortran and C++

Abstract

Fortran 90 introduced assumed shape arrays (colloquially referred to as "Fortran 90 Arrays"), a big step forward.  Fortran 2003 introduced interoperability with C/C++, another step forward.  Unfortunately, there is no well-known way to pass assumed shape arrays between Fortran and C++.  This is a serious problem: assumed shape arrays are widely used in modern Fortran code, and not being able to interoperate easily with C/C++ in those cases can make the C interop features nearly useless.

Here, we present a standards-compliant way to allocate assumed shape arrays in Fortran and pass them to code written in C++.  This method is fully compliant with the Fortran 2003 and C++98/11 language standards.

Prerequisites

We begin by using a C++ library that provides functionality equivalent to Fortran's assumed shape arrays.  We used Blitz++; however, others have suggested Armadillo and Eigen.  Although we have not investigated, we believe the techniques outlined here would work for these other packages as well.

These techniques will also require a Fortran 2003 compiler that supports the ISO standard Fortran/C interop.

Description

With appropriate information on memory layout, Blitz++ is able to create array structures that reference already-allocated areas of memory.  The required information is stored in the Fortran "dope vector."  Unfortunately, dope vectors are implement-dependent, and cannot be read directly.

We proceed by reconstructing the Fortran dope vector using a series of standard Fortran 2003 calls.  Consider the following structure, which can store information on a 2-dimensional dope vector:
type, bind(c) :: arr_spec_2    type(c_ptr) :: base    type(c_ptr) :: deltas(2    integer(c_int) :: lbounds(2)    integer(c_int) :: ubounds(2)end type
We then fill it in.  Here is how that can be done for 2-D assumed shape arrays of type real*8:
    function c_loc_double(x)
        use, intrinsic :: iso_c_binding
        real*8, target :: x
        type(c_ptr) :: c_loc_double
        c_loc_double = c_loc(x)
    end function
    ! ================ Type (real*8, double), Rank 2
    subroutine get_spec_double_2(arr, low1,low2, spec)
    implicit none
    real*8, dimension(:,:), target :: arr
    integer :: low1,low2
    type(arr_spec_2) :: spec
        spec%base = c_loc_double( arr(lbound(arr,1),lbound(arr,2)) )
     
        ! ------- Dimension 1
        spec%lbounds(1) = low1
        spec%ubounds(1) = low1 + ubound(arr,1) - 1
        if (spec%lbounds(1) < spec%ubounds(1)) then
            spec%deltas(1) = c_loc_double(arr(lbound(arr,1)+1,lbound(arr,2)))
        else
            spec%deltas(1) = spec%base
        end if
        ! ------- Dimension 2
        spec%lbounds(2) = low2
        spec%ubounds(2) = low2 + ubound(arr,2) - 1
        if (spec%lbounds(2) < spec%ubounds(2)) then
            spec%deltas(2) = c_loc_double(arr(lbound(arr,1),lbound(arr,2)+1))
        else
            spec%deltas(2) = spec%base
        end if
    end subroutine
A few things to note:

  1. The function c_loc_double(x) is required because c_loc() does not seem to work directly on elements of arrays (at least not in GNU's gfortran).
  2. The deltas() array contains the address of the second element in each dimension.  True skip offsets for each dimension can be computed by spec%deltas(i) - spec%base.  This is not done because Fortran does not have pointer arithmetic.

Fortran 2003 does not preserve the lower bound of an array when you pass it.  Therefore, the lower bounds of each dimension have to be passed into the subroutine get_spec_double_2().  In order to make life more convenient, this can be hidden under the hood with the following macro (if one is using Fortran with the C pre-processor):
#define GET_SPEC_DOUBLE_2(arr, spec) \    get_spec_double_2(arr, lbound(arr,1),lbound(arr,2), spec)

With the dope vector stored in a type(arr_spec_2) object, that information may now be passed to a C++ subroutine via a matching C++ structure --- and then reconstituted into a Blitz++ array.  We can use C++ templates to make a version that will work for all types and ranks:
/** Used to accept Fortran 90 arrays and re-constitute them as blitz++ arrays.
@see f90blitz_f.py */
template<class ArrT, int rank>
struct F90Array {
    ArrT *base;
    ArrT *deltas[rank];
    int lbounds[rank];
    int ubounds[rank];
    /** Extract F90Array info from existing blitz::Array.  Used to
    write C++ test code for Fortrn APIs. */
    F90Array(blitz::Array<ArrT,rank> &arr) {
        this->base = arr.data();
        blitz::TinyVector<int, rank> idx(0);
        for (int i=0; i<rank; ++i) {
            this->deltas[i] = this->base + arr.stride(i);
            this->lbounds[i] = arr.lbound(i);
            this->ubounds[i] = arr.ubound(i);
        }
    }

    blitz::Array<ArrT,rank> to_blitz()
    {
        blitz::TinyVector<int, rank> shape, stride;
        blitz::GeneralArrayStorage<rank> stor;
        for (int i=0; i<rank; ++i) {
            shape[i] = ubounds[i] - lbounds[i] + 1;
            stride[i] = deltas[i] - base;
            stor.base()[i] = lbounds[i];
            // Ordering is not needed because we're using stride
            // stor.ordering()[i] = i;      // Fortran ordering, blitz++ manual p31
        }
        return blitz::Array<ArrT,rank>(base, shape, stride,
            blitz::neverDeleteData, stor);
    }
};

Example

The code may now be used as follows:


interface    subroutine c_function(arr_spec) bind(c)        type(arr_spec_2) :: arr_spec    end subroutine c_functionend interface
real*8, dimension(:,:) :: arrtype(arr_spec_2) :: arr_spec
    call GET_SPEC_DOUBLE_2(arr, arr_spec)    call c_function(arr_spec)
-------------------------------extern "C" void c_function(F90Array<double, 2> &arr_spec){    auto arr(arr_spec.to_blitz());    printf("Here is a value: %f\n", arr(17,34));}

Generalization

The above code works only for rank-2 arrays of real*8.  The same principles can be applied to extract and use dope vectors for any size and kind of array; but because Fortran lacks templates, writing out the required code would be tedious.  For this reason, we provide a Python script, f90blitz_f.py to generate the required Fortran code.  This script really needs only be run once.  Its usage is:
python f90blitz_f.py <output.f90> <output.h>
The variables fctypes and ranks must be modified inside the script, to control which data types and ranks of arrays get generated.  This can be customized as needed for any particular program.

The required code is only two files: a templated C++ header file, and a Python script to generate the required Fortran files.  That code may be downloaded here:


https://drive.google.com/file/d/0B1tFrRmi_97eRGlmWU1CaVJ4NkE/edit?usp=sharing
https://drive.google.com/file/d/0B1tFrRmi_97eWUdZV3pvZlpyNGs/edit?usp=sharing