From owner-hpff-doc  Thu May  9 10:21:49 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id KAA18389 for hpff-doc-out; Thu, 9 May 1996 10:21:49 -0500 (CDT)
Received: from [128.42.1.213] (morpheus.cs.rice.edu [128.42.1.213]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id KAA18384; Thu, 9 May 1996 10:21:45 -0500 (CDT)
X-Sender: chk@titan.cs.rice.edu
Message-Id: <v01530519adb7cc2ed770@[128.42.1.213]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 9 May 1996 10:22:31 -0600
To: hpff-doc, chk
From: chk@cs.rice.edu (Chuck Koelbel)
Subject: hpff-doc: Testing, 1, 2, 3, ...
Sender: owner-hpff-doc
Precedence: bulk

---------------------------------------------------------------------------
hpff-doc@cs.rice.edu is a mailing list for HPF 2.0 language specification
authors and editors.  Instructions for adding or deleting yourself from this
list appear at the bottom of this message.
---------------------------------------------------------------------------



---------------------------------------------------------------------------
To (un)subscribe to this list, send mail to hpff-doc-request@cs.rice.edu.
Leave the subject line blank, and in the body put the line
(un)subscribe <email-address>
---------------------------------------------------------------------------

From owner-hpff-doc  Thu May  9 10:26:54 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id KAA18599 for hpff-doc-out; Thu, 9 May 1996 10:26:54 -0500 (CDT)
Received: from [128.42.1.213] (morpheus.cs.rice.edu [128.42.1.213]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id KAA18592 for <hpff-doc>; Thu, 9 May 1996 10:26:49 -0500 (CDT)
X-Sender: chk@titan.cs.rice.edu
Message-Id: <v0153051aadb7cccafbf5@[128.42.1.213]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 9 May 1996 10:27:35 -0600
To: hpff-doc
From: chk@cs.rice.edu (Chuck Koelbel)
Subject: hpff-doc: Welcome to hpff-doc@cs.rice.edu!
Sender: owner-hpff-doc
Precedence: bulk

[Sorry about that "Testing" message, I was trying to debug a majordomo
configuration remotely...]

Hello, fellow writers & readers!

This is the hpff-doc@cs.rice.edu mailing list.  It is for distribution of
and comments on the HPF version 2.0 document.

The list is maintained by Majordomo, so you may add and delete yourselves
as usual, i.e.

To add yourself to the list:
        mail majordomo@cs.rice.edu << EOF
        subscribe hpff-doc
        EOF

To remove yourself from the list:
        mail majordomo@cs.rice.edu << EOF
        unsubscribe hpff-doc
        EOF

To add someone else (or another alias for yourself) to the list:
        mail majordomo@cs.rice.edu << EOF
        subscribe hpff-doc whoever@where.ever.com
        EOF

You can look for the latest "release" of the document (currently the 1.2
draft that I had, missing the library/intrinsics chapter) in
        ftp://titan.cs.rice.edu/public/HPFF/hpf2.0-draft/
Expect frequent updates.

I see that Rob has sent a new draft of the intrinsics chapter, which will
be installed as soon as bandwidth permits.

                                                                Chuck



From owner-hpff-doc  Thu May  9 10:35:37 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id KAA18985 for hpff-doc-out; Thu, 9 May 1996 10:35:37 -0500 (CDT)
Received: from dawn.cs.rice.edu (dawn.cs.rice.edu [128.42.1.127]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id KAA18980; Thu, 9 May 1996 10:35:34 -0500 (CDT)
Received: (from ratner@localhost) by dawn.cs.rice.edu (8.7.3/8.7.3) id KAA16765; Thu, 9 May 1996 10:35:33 -0500 (CDT)
Date: Thu, 9 May 1996 10:35:33 -0500 (CDT)
Message-Id: <199605091535.KAA16765@dawn.cs.rice.edu>
From: Logan Ratner <ratner@cs.rice.edu>
To: chk@cs.rice.edu
CC: hpff-doc@cs.rice.edu
In-reply-to: <v0153051aadb7cccafbf5@[128.42.1.213]> (chk@cs.rice.edu)
Subject: Re: hpff-doc: Welcome to hpff-doc@cs.rice.edu!
Sender: owner-hpff-doc
Precedence: bulk


Oh, by the way, the hpff-doc list is configured, up and running.
(I guess everyone already knows this, but consider this the
'official' sysop-type announcement.)

-- 
Logan Ratner (ratner@rice.edu)   *  CRPC, Rice University, Houston
http://www.cs.rice.edu/~ratner/  *  http://softlib.rice.edu/CRPC/
Big Brother is clumsy and obvious,  Its Little Cousins that worry me.
         I am not Ms. Ratner despite what the NSA may think. 

From owner-hpff-doc  Thu May  9 14:00:20 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id OAA27974 for hpff-doc-out; Thu, 9 May 1996 14:00:20 -0500 (CDT)
Received: from [128.42.1.213] (morpheus.cs.rice.edu [128.42.1.213]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id OAA27968 for <hpff-doc>; Thu, 9 May 1996 14:00:13 -0500 (CDT)
X-Sender: chk@titan.cs.rice.edu
Message-Id: <v01530525adb7fc8031a0@[128.42.1.213]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 9 May 1996 14:00:59 -0600
To: hpff-doc
From: chk@cs.rice.edu (Chuck Koelbel)
Subject: hpff-doc: RFI - What does "latex" mean to you?
Sender: owner-hpff-doc
Precedence: bulk

HPFF-DOCers -

As (I hope) all of you are aware, we have decided to use LaTeX for the HPF
2.0 document.

In getting the raw text together, I realized/remembered that "LaTeX" has
changed since the time of the HPF 1.0 document.  There are now two rather
different LaTeX systems -
        LaTeX 2e, the newest, fanciest, *supported* system
        LaTeX 2.09, the older, cruftier system *used for HPF 1.0*
LaTeX 2e has a fair number of new features (for example, support for
imported postscript) and new spellings for a couple of old commands (for
example, {\tt whatever} is now \texttt{whatever}).  The good news is, LaTeX
2e has a "compatibility mode" that can process most 2.09 documents - in
particular, compatibility mode seems to survive the HPF syntax macros (no
mean feat).
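
For anyone who has not compared the two, here is a minimal sketch of the
difference.  The two files below are illustrative only and are not taken
from the HPF sources; the command spellings are standard LaTeX, and the
11pt/report choices are assumptions.

```latex
% --- File 1: LaTeX 2.09 style --------------------------------------
% LaTeX 2e enters compatibility mode when it sees \documentstyle.
\documentstyle[11pt]{report}
\begin{document}
Old font switches: {\tt whatever} and {\em whatever}.
\end{document}

% --- File 2: LaTeX 2e style (a separate file) ----------------------
% \documentclass replaces \documentstyle; add-ons load via \usepackage.
\documentclass[11pt]{report}
\begin{document}
New font commands: \texttt{whatever} and \emph{whatever}.
\end{document}
```

The old {\tt ...} and {\em ...} forms are still accepted by LaTeX 2e, so a
conversion can be done one file at a time.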


Question for the group:
Should we move to LaTeX 2e for the HPF 2.0 document?
Or should we continue to use LaTeX 2.09 (through compatibility mode)?


For now, I am keeping text in LaTeX 2.09 form.  I believe that most changes
to LaTeX 2e can be automated.  The HPF Library and Intrinsics chapter will
not be pretty in either form, however...

                                                Chuck



From owner-hpff-doc  Thu May  9 14:30:07 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id OAA29246 for hpff-doc-out; Thu, 9 May 1996 14:30:07 -0500 (CDT)
Received: from mail13.digital.com (mail13.digital.com [192.208.46.30]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id OAA29240 for <hpff-doc@cs.rice.edu>; Thu, 9 May 1996 14:30:02 -0500 (CDT)
Received: from mpsg.hpc.pko.dec.com by mail13.digital.com (8.7.5/UNX 1.2/1.0/WV)
	id PAA14175; Thu, 9 May 1996 15:20:17 -0400 (EDT)
Received: by mpsg.hpc.pko.dec.com; id AA11666; Thu, 9 May 1996 15:27:32 -0400
From: offner@hpc.pko.dec.com (Carl Offner)
Received: by hardy.hpc.pko.dec.com; (5.65v3.2/1.1.8.2/01Nov94-0839AM)
	id AA20181; Thu, 9 May 1996 15:19:44 -0400
Date: Thu, 9 May 1996 15:19:44 -0400
Message-Id: <9605091919.AA20181@hardy.hpc.pko.dec.com>
To: hpff-doc@cs.rice.edu
Subject: hpff-doc: Latex2e
Sender: owner-hpff-doc
Precedence: bulk


My vote is for using (the new) Latex2e.  It's almost trivial to convert
(one easy pass in my experience), and Latex2e is friendlier and more
robust.  Plus it's not an "archival standard", like Fortran 77.

		--Carl Offner

From owner-hpff-doc  Fri May 10 13:03:25 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id NAA08456 for hpff-doc-out; Fri, 10 May 1996 13:03:25 -0500 (CDT)
Received: from coral.llnl.gov. (coral.llnl.gov [134.9.1.2]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id NAA08449 for <hpff-doc@cs.rice.edu>; Fri, 10 May 1996 13:03:20 -0500 (CDT)
Message-Id: <199605101803.NAA08449@cs.rice.edu>
Received: by coral.llnl.gov
	(1.40.112.4/16.2) id AA083071397; Fri, 10 May 1996 11:03:17 -0700
Date: Fri, 10 May 1996 11:03:17 -0700
From: Mary E Zosel <zosel@coral.llnl.gov>
To: hpff-doc@cs.rice.edu
Subject: hpff-doc: Re: latex2e
Mime-Version: 1.0
Content-Type: text/plain; charset=X-roman8
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-doc
Precedence: bulk

Chuck ... I don't have any "answer" to your "which latex" question ...
the latex on the system I use is still the older one - but since I
asked about it the local keeper of latex has put it on his to-do-sometime-list
to upgrade. {But I don't know when that will be, so for the short term,
anything I do will be in the older variation.}

John points out that anyone can find more information at:

http://www.cogs.susx.ac.uk/cgi-bin/texfaq2html?keyword=&question=105
This is part of a TeX FAQ at
http://www.cogs.susx.ac.uk/cgi-bin/texfaq2html?introduction=yes


   -mary-

From owner-hpff-doc  Sat May 11 00:31:24 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id AAA01272 for hpff-doc-out; Sat, 11 May 1996 00:31:24 -0500 (CDT)
Received: from [128.42.5.202] (pasyn-74.rice.edu [128.42.5.202]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id AAA01267 for <hpff-doc>; Sat, 11 May 1996 00:31:19 -0500 (CDT)
X-Sender: chk@titan.cs.rice.edu (Unverified)
Message-Id: <v01530500adb9e27a2e7c@[128.42.1.213]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Sat, 11 May 1996 00:32:10 -0600
To: hpff-doc
From: chk@cs.rice.edu (Chuck Koelbel)
Subject: hpff-doc: HPF 2.0 raw text available
Sender: owner-hpff-doc
Precedence: bulk

And I almost got it done the day it was supposed to be done...

The "raw text" release of HPF 2.0 is now available for downloading at

ftp://titan.cs.rice.edu/public/HPFF/hpf2.0-draft/

You can get the individual files from there, or you can get the whole ball
of wax as

ftp://titan.cs.rice.edu/public/HPFF/hpf2.0-draft/Releases/11-may-96.tar.gz

This version is guaranteed *not* to pass through LaTeX unscathed.  However,
to get a feel for the document structure, see hpf-report.tex; you should be
able to follow the \include commands easily enough.  Most of the file names
have changed from the HPF 1.x directories, in part so that I could keep
things straight while editing.  In general,
        xxx.tex is a chapter from the HPF 2.0 part
            (or is \input from some other file)
        xxx-ext.tex is a chapter from the approved extensions part
        xxx-app.tex is a chapter from the appendix part
        xxx-part.tex is a divider between parts

The debate still rages over whether to use LaTeX 2e or LaTeX 2.09.  Those
not writing text (or writing new text only) generally favor 2e; those who
would have to revise old text tend to prefer 2.09.

OK, time to pack for Chicago now...

                                                Chuck



From owner-hpff-doc  Wed May 22 18:01:05 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id SAA01893 for hpff-doc-out; Wed, 22 May 1996 18:01:05 -0500 (CDT)
Received: from coral.llnl.gov. (coral.llnl.gov [134.9.1.2]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id SAA01887; Wed, 22 May 1996 18:00:59 -0500 (CDT)
Message-Id: <199605222300.SAA01887@cs.rice.edu>
Received: by coral.llnl.gov
	(1.40.112.4/16.2) id AA255456059; Wed, 22 May 1996 16:00:59 -0700
Date: Wed, 22 May 1996 16:00:59 -0700
From: Mary E Zosel <zosel@coral.llnl.gov>
To: chk@cs.rice.edu, hpff-doc@cs.rice.edu
Subject: hpff-doc: new subset-app.tex 
Mime-Version: 1.0
Content-Type: text/plain; charset=X-roman8
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-doc
Precedence: bulk


% File: subset-app.tex

% Contents:
% Subset HPF Appendix for HPF 2.0 document

% Revision history:
% May-10-96	Created by Charles Koelbel, Rice University
%		(from HPF 1.1 document)
% May-22-96     Edited by Mary Zosel, LLNL
%               change the first paragraph to reflect the new status



\chapter{Subset HPF}
\label{ch-subset-app}

{\em
Comments on this chapter should be directed to 
Mary Zosel ({\tt zosel@llnl.gov})
and {\tt hpff-doc@cs.rice.edu}.
Please use ``{\tt Comments on Subset HPF}'' as the {\tt Subject:}
line.
\par
}

%This chapter presents a subset of HPF capable of
%being implemented more rapidly than the full HPF.
%A subset implementation will  provide a portable
%interim HPF  capability. Full HPF implementations should
%be developed as rapidly as possible.  The definition of
%the subset  language is intended to be a minimal
%requirement.  A given implementation may support
%additional Fortran 90  and HPF features.

Formally, subset HPF  no longer exists. This section is
included to leave a record of what the subset was. The
intent of the original subset was to develop an interim
HPF capability.  

\section{Fortran 90 Features in Subset High
Performance Fortran}

The items listed here are the features of the HPF subset 
language. For reference, the section numbers from the
Fortran 90 standard are given along with the related
syntax rule numbers:

\sloppy

\begin{itemize}

\item All FORTRAN 77 standard conforming features,
except for storage and sequence association. 
(See Section~\ref{sequence} for detailed discussion of the exception.)

\item The Fortran 90 definitions of MIL-STD-1753 features:

  \begin{itemize}

  \item {\tt DO WHILE} statement (8.1.4.1.1 / R821)

  \item {\tt END DO} statement (8.1.4.1.1 / R825)

  \item {\tt IMPLICIT NONE} statement (5.3 / R540)

  \item {\tt INCLUDE} line (3.4)

  \item scalar bit manipulation intrinsic procedures:
     {\tt IOR}, {\tt IAND}, {\tt NOT}, {\tt IEOR}, {\tt ISHFT}, {\tt ISHFTC},
     {\tt BTEST}, {\tt IBSET}, {\tt IBCLR}, {\tt IBITS}, {\tt MVBITS}  (13.13)

  \item binary, octal and hexadecimal constants for use
  in {\tt DATA} statements (4.3.1.1 / R407 and 5.2.9 / R533)

  \end{itemize}

\item Arithmetic and logical array features:

  \begin{itemize}

  \item array sections (6.2.2.3 / R618--621)

    \begin{itemize}

    \item subscript triplet notation (6.2.2.3.1)

    \item vector-valued subscripts (6.2.2.3.2)

    \end{itemize}

  \item array constructors limited to one level of implied
  {\tt DO} (4.5 / R431)

  \item arithmetic and logical operations on whole arrays and
     array sections (2.4.3, 2.4.5, and 7.1)

  \item array assignment (2.4.5, 7.5, 7.5.1.4, and 7.5.1.5)

  \item masked array assignment (7.5.3)

    \begin{itemize}

    \item {\tt WHERE} statement (7.5.3 / R738)

    \item block {\tt WHERE} . . . {\tt ELSEWHERE} construct (7.5.3 / R739)

    \end{itemize}

  \item array-valued external functions (12.5.2.2)

  \item automatic arrays (5.1.2.4.1)

  \item {\tt ALLOCATABLE} arrays and the {\tt ALLOCATE} and
  {\tt DEALLOCATE} statements
  (5.1.2.4.3, 6.3.1 / R622, and 6.3.3 / R631)

  \item assumed-shape arrays (5.1.2.4.2 / R516)

  \end{itemize}

\item Intrinsic procedures:

The list of intrinsic functions and subroutines below is a combination
of (a) routines which are entirely new to Fortran and (b) routines that have
always been part of Fortran, but now have been extended to new argument
and result types.  The new or extended definitions of these routines
are part of the subset. If a FORTRAN 77 routine is not included in this
list, then only the original FORTRAN 77 definition is part of the
subset.

%\begin{obsolete}
%For all of the intrinsics that have an optional argument {\tt DIM},
%only actual argument expressions for {\tt DIM} that are initialization
%expressions and hence deliver a known shape at compile time are part of
%the subset.  The intrinsics with this constraint are marked with \dag in
%the list below.
%\end{obsolete}

For all of the intrinsics that have an optional argument {\tt DIM},
only actual argument expressions for {\tt DIM} that are initialization
expressions are part of the subset.  The intrinsics with this constraint
are marked with \dag in the list below.



  \begin{itemize}

  \item the argument presence inquiry function:
      {\tt PRESENT} (13.10.1)

  \item all the numeric elemental functions:
     {\tt ABS}, {\tt AIMAG}, {\tt AINT}, {\tt ANINT}, {\tt CEILING},
     {\tt CMPLX}, {\tt  CONJG}, {\tt  DBLE}, {\tt DIM}, {\tt DPROD},
     {\tt FLOOR}, {\tt INT}, {\tt MAX}, {\tt MIN}, {\tt MOD}, {\tt
     MODULO}, {\tt NINT}, {\tt REAL}, {\tt SIGN} (13.10.2)


  \item all mathematical elemental functions:
     {\tt ACOS}, {\tt ASIN}, {\tt ATAN}, {\tt ATAN2}, {\tt COS}, {\tt
     COSH}, {\tt EXP}, {\tt LOG}, {\tt LOG10}, {\tt SIN}, {\tt SINH},
     {\tt SQRT}, {\tt TAN}, {\tt TANH} (13.10.3)

  \item all the bit manipulation elemental functions:
     {\tt BTEST}, {\tt IAND}, {\tt IBCLR}, {\tt IBITS}, {\tt IBSET},
     {\tt IEOR}, {\tt IOR}, {\tt ISHFT}, {\tt ISHFTC}, {\tt NOT}
     (13.10.10)

  \item all the vector and matrix multiply functions:
    {\tt DOT\_PRODUCT}, {\tt MATMUL} (13.10.13)

  \item all the array reduction functions:
     {\tt ALL}\dag, {\tt ANY}\dag, {\tt COUNT}\dag, {\tt
     MAXVAL}\dag, {\tt MINVAL}\dag, {\tt PRODUCT}\dag, {\tt
     SUM}\dag (13.10.14)

  \item all the array inquiry functions:
     {\tt ALLOCATED}, {\tt LBOUND}\dag, {\tt SHAPE}, {\tt
     SIZE}\dag, {\tt UBOUND}\dag (13.10.15)

  \item all the array construction functions:
     {\tt MERGE}, {\tt PACK}, {\tt SPREAD}\dag, {\tt UNPACK}
     (13.10.16)

  \item the array reshape function:
     {\tt RESHAPE} (13.10.17)

  \item all the array manipulation functions:
     {\tt CSHIFT}\dag, {\tt EOSHIFT}\dag, {\tt TRANSPOSE}
     (13.10.18)

  \item all array location functions:
     {\tt MAXLOC}\dag, {\tt MINLOC}\dag (13.10.19)

  \item all intrinsic subroutines:
      {\tt DATE\_AND\_TIME}, {\tt MVBITS}, {\tt RANDOM\_NUMBER}, {\tt
      RANDOM\_SEED}, {\tt SYSTEM\_CLOCK} (13.11)

  \end{itemize}

\item Declarations:

  \begin{itemize}

  \item Type declaration statements, with all forms of {\it type-spec}
     except {\it kind-selector} and {\tt TYPE}(type-name), and all forms
     of {\it attr-spec}  except {\it access-spec}, {\tt TARGET}, and {\tt
     POINTER}.  (5.1 / R501--503, R510)


  \item attribute specification statements:
     {\tt ALLOCATABLE}, {\tt INTENT}, {\tt OPTIONAL}, {\tt PARAMETER},
     {\tt SAVE} (5.2)

  \end{itemize}

\item Procedure features:

  \begin{itemize}

  \item {\tt INTERFACE} blocks with no {\it generic-spec} or {\it
     module-procedure-stmt} (12.3.2.1)

  \item optional arguments (5.2.2)

  \item keyword argument passing (12.4.1 / R1212)

  \end{itemize}

\item Syntax improvements:

  \begin{itemize}

  \item long (31 character) names (3.2.2)

  \item lower case letters (3.1.7)

  \item use of ``\_'' in names (3.1.3)

  \item ``!'' initiated comments, both full line and trailing
  (3.3.2.1)

  \end{itemize}

\end{itemize}


\section{Discussion of the Fortran 90 Subset Features}

\begin{rationale}
There are many Fortran 90 features which are useful and relatively easy
to implement, but are not included in the subset language.  Features
were selected for the subset language for several reasons.

The MIL-STD-1753 features have been implemented so widely that many
users have forgotten that they are not part of FORTRAN 77.  They are
included in the HPF subset.

The biggest addition to FORTRAN 77 in the HPF subset language is the
inclusion of the array language. A number of vendors have identified
the usefulness of array operations for concise expression of
parallelism and  already support these features. However, the character
array language is not part of the subset.  

The new storage classes such as allocatable, automatic, and
assumed-shape objects are included in the subset.  They provide an
important alternative to the use of storage association features such
as {\tt EQUIVALENCE} for memory management.

Interface blocks have been added to the subset in order to facilitate
use of the HPF directives across subroutine boundaries.  The interface
blocks provide a mechanism to specify the expected mapping of data, in
addition to the types and intents of the arguments.

There were other Fortran 90 features considered for the subset. Some
features such as {\tt CASE} or {\tt NAMELIST} were recognized as popular
features of Fortran 90, but had no direct bearing on high performance.  Other
features such as support for double precision complex (via {\tt KIND}) or
procedureless {\tt MODULES} were rejected because of the perception that the
additional implementation complexity might delay release of subset
compilers. It was not a goal of HPFF to define an ``ideal'' subset of
Fortran 90 for all purposes.

Additional syntactic improvements are included, such as long names and
the ``!'' form of comments, because of their general usefulness in
program documentation, including the description of HPF itself.
\end{rationale}


\section{HPF Features Not in Subset High Performance Fortran}

All HPF directives  and language extensions are included
in the HPF subset language with the following exceptions:

\begin{itemize}

\item The {\tt REALIGN}, {\tt REDISTRIBUTE}, and {\tt DYNAMIC} directives;


%\begin{obsolete}
%\item 
%The {\tt INHERIT} directive used with a {\it dist-format-clause}
%or
%{\it dist-target} that is transcriptive (``lone star'') either
%explicitly or implicitly;
%\end{obsolete}

\item  
The {\tt INHERIT} directive;

\item The {\tt PURE} function attribute;

\item The {\it forall-construct};

\item The HPF library and the {\tt HPF\_LIBRARY} module;

\item Actual argument expressions corresponding to optional {\tt DIM}
arguments to the Fortran 90 {\tt MAXLOC} and {\tt MINLOC} intrinsic
functions that are not initialization expressions; and

\item The {\tt EXTRINSIC} function attribute.

\end{itemize}


\section{Discussion of the HPF Extension Subset}

\begin{rationale}
The data mapping features of the HPF subset are limited to static
mappings, plus the possible remapping of arguments across the interface
of subprogram boundaries. Since the subset language does not include
{\tt MODULES}, and {\tt COMMON} block variables cannot be remapped,
this restriction only impacts remapping of local variables and
additional remapping of arguments, after the subprogram boundary.

%\begin{obsolete}
%The {\tt INHERIT} directive may be used in the subset, but the user must
%provide an explicit descriptive or prescriptive distribution for
%the dummy argument in question.
%\end{obsolete}

The {\tt INHERIT} directive is no longer included in the subset. The
case where it is most useful (to describe the template of the full
array, when only a section of an array is passed as an argument)
cannot be declared properly with the former restriction on use of
transcriptive distributions, combined with the fact that processor
directives cannot be used to describe only parts of the processor set.

Only the simplest version of the {\tt FORALL} statement is required in the
subset. Note that the omission of the {\tt PURE} attribute from the
subset means that only HPF and Fortran 90 intrinsic functions can be
called from the {\tt FORALL} statement.  No other subprograms can be
called.

Only the intrinsics which are useful for declaration of variables and
mapping inquiries are included in the subset.  The full set of
extended operations proposed for the HPF library is not required, and
since {\tt MODULE} is not part of the subset, the {\tt HPF\_LIBRARY} module
is also not part of the subset. The extrinsic interface attribute is
also not in the subset.  This includes any specific extrinsic models
such as the model described in Annex~\ref{LOCAL-ANNEX}.

All of these HPF language reductions are made in the spirit of
allowing  vendors to produce a usable subset version of HPF quickly so
that initial experimentation with the language can begin. This list of
HPF features excluded from the subset should not be interpreted as
requiring implementors to omit the features from the subset.
Implementations  with as many HPF features as possible are encouraged.
The list does, however, establish the features a user should avoid if
an HPF application is expected to be moved between different HPF subset
implementations.
\end{rationale}

\fussy


From owner-hpff-doc  Wed May 22 18:33:54 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id SAA02659 for hpff-doc-out; Wed, 22 May 1996 18:33:54 -0500 (CDT)
Received: from coral.llnl.gov. (coral.llnl.gov [134.9.1.2]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id SAA02652 for <hpff-doc@cs.rice.edu>; Wed, 22 May 1996 18:33:49 -0500 (CDT)
Message-Id: <199605222333.SAA02652@cs.rice.edu>
Received: by coral.llnl.gov
	(1.40.112.4/16.2) id AA259818028; Wed, 22 May 1996 16:33:48 -0700
Date: Wed, 22 May 1996 16:33:48 -0700
From: Mary E Zosel <zosel@coral.llnl.gov>
To: hpff-doc@cs.rice.edu
Subject: hpff-doc: new credits.tex
Mime-Version: 1.0
Content-Type: text/plain; charset=X-roman8
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-doc
Precedence: bulk


% File: credits.tex

% Contents:
% Acknowledgements for HPF 2.0 document

% Revision history:
% May-10-96	Created by Charles Koelbel, Rice University
%		(copied from HPF 1.1 document)



\chapter*{Acknowledgments}


{\em
Comments on this chapter should be directed to 
Mary Zosel ({\tt zosel@llnl.gov}),
Charles Koelbel ({\tt chk@cs.rice.edu}),
and {\tt hpff-doc@cs.rice.edu}.
Please use ``{\tt Comments on Acknowledgments}'' as the {\tt Subject:}
line.
\par
}

Since its introduction over three decades ago, Fortran has been the
language of choice for scientific programming for sequential
computers.  Exploiting the full capability of modern architectures,
however, increasingly requires more information than ordinary FORTRAN
77 or Fortran 90 programs provide.  This information applies to such
areas as:

\begin{itemize}

\item Opportunities for parallel execution;

\item Type of available parallelism --- MIMD, SIMD, or some combination;

\item Allocation of data among individual processor memories; and

\item Placement of data within a single processor.

\end{itemize}

The High Performance Fortran Forum (HPFF) was founded as a coalition of
industrial and academic groups working to suggest a set of standard
extensions to Fortran to provide the necessary information.  Its intent
was to develop extensions to Fortran that provide support for high
performance programming on a wide variety of machines, including
massively parallel SIMD and MIMD systems and vector processors.  From
its beginning, HPFF included most vendors delivering parallel machines,
a number of government laboratories, and many university research
groups.  Public input was encouraged to the greatest extent possible.
The result of this project is this document, intended to be a language
specification portable from workstations to massively parallel
supercomputers while being able to express the algorithms needed to
achieve high performance on specific architectures.

\section{HPFF Acknowledgments}
\label{hpff-ack}

Technical development for HPF 1.0  was carried out by subgroups, and
was reviewed by the full committee.  Many people served in positions of 
responsibility:

\begin{itemize}

\item Ken Kennedy, Convener and Meeting Chair;

\item Charles Koelbel, Executive Director and Head of the FORALL Subgroup;

\item Mary Zosel, Head of the Fortran 90 and Storage Association Subgroup;

\item Guy Steele, Head of the Data Distribution Subgroup;

\item Rob Schreiber, Head of the Intrinsics Subgroup;

\item Bob Knighten, Head of the Parallel I/O Subgroup;

\item Marc Snir, Head of the Extrinsics Subgroup;

\item Joel Williamson and Marina Chen, Heads of the Subroutine 
Interface Subgroup; and

\item David Loveman, Editor.

\end{itemize}



Geoffrey Fox convened the first HPFF meeting with Ken Kennedy and
later led a group to develop benchmarks for HPF.  
Clemens-August Thole organized a  group in Europe and was
instrumental in making this an international effort.  Charles Koelbel
produced detailed meeting  minutes
that were invaluable to
subgroup heads in preparing successive revisions to the draft
proposal.  Guy Steele developed \LaTeX\ macros for a variety of tasks, 
including formatting BNF grammar, Fortran code and pseudocode, and 
commentary material; the document would have been much less aesthetically 
pleasing without his efforts.

Many companies, universities, and other entities supported their
employees' attendance at the HPFF meetings, both directly and
indirectly.  The following organizations were represented at two or
more meetings by the following individuals (not including those present
at the first HPFF meeting in January of 1992, for which there is no 
accurate attendee list):

%Organization \hfill Attendee 
Alliant Computer Systems Corporation \dotfill  David Reese
\hspace{.125in}

Amoco Production Company  \dotfill  Jerrold Wagener, Rex Page
\hspace{.125in}

Applied Parallel Research  \dotfill John Levesque, Rony Sawdayi, Gene
Wagenbreth  \hspace{.125in}

Archipel  \dotfill  Jean-Laurent Philippe \hspace{.125in}

CONVEX Computer Corporation  \dotfill  Joel Williamson \hspace{.125in}

Cornell Theory Center  \dotfill  David Presberg \hspace{.125in}

Cray Research, Inc.  \dotfill  Tom MacDonald, Andy Meltzer
\hspace{.125in}

Digital Equipment Corporation  \dotfill  David Loveman
\hspace{.125in}

Fujitsu America \dotfill  Siamak Hassanzadeh, Ken Muira \hspace{.125in}

Fujitsu Laboratories \dotfill Hidetoshi Iwashita \hspace{.125in}

GMD-I1.T, Sankt Augustin  \dotfill  Clemens-August Thole
\hspace{.125in}

Hewlett Packard  \dotfill  Maureen Hoffert, Tin-Fook Ngai, Richard
Schooler \hspace{.125in}

IBM  \dotfill  Alan Adamson, Randy Scarborough, Marc Snir, Kate Stewart
\hspace{.125in}

Institute for Computer Applications in Science \& Engineering  \dotfill
    Piyush Mehrotra \hspace{.125in} 

Intel Supercomputer Systems Division  \dotfill  Bob Knighten
\hspace{.125in}

Lahey Computer  \dotfill  Lev Dyadkin, Richard Fuhler, Thomas Lahey,
Matt Snyder \hspace{.125in}

Lawrence Livermore National Laboratory  \dotfill  Mary Zosel
\hspace{.125in}

Los Alamos National Laboratory  \dotfill  Ralph Brickner, Margaret
Simmons \hspace{.125in}

Louisiana State University  \dotfill  J. Ramanujam \hspace{.125in}

MasPar Computer Corporation  \dotfill  Richard Swift \hspace{.125in}

Meiko, Inc.  \dotfill  James Cownie \hspace{.125in}

nCUBE, Inc.  \dotfill  Barry Keane, Venkata Konda \hspace{.125in}

Ohio State University  \dotfill  P. Sadayappan \hspace{.125in}

Oregon Graduate Institute of Science and Technology  \dotfill  Robert
Babb II \hspace{.125in}

The Portland Group, Inc.  \dotfill  Vince Schuster \hspace{.125in}

Research Institute for Advanced Computer Science  \dotfill  Robert
Schreiber \hspace{.125in}

Rice University  \dotfill  Ken Kennedy, Charles Koelbel \hspace{.125in}

Schlumberger  \dotfill  Peter Highnam \hspace{.125in}

Shell  \dotfill  Don Heller \hspace{.125in}

State University of New York at Buffalo  \dotfill  Min-You Wu
\hspace{.125in}

SunPro and Sun Microsystems  \dotfill  Prakash Narayan, Douglas Walls
\hspace{.125in}

Syracuse University  \dotfill  Alok Choudhary, Tom Haupt
\hspace{.125in}

TNO-TU Delft  \dotfill  Edwin Paalvast, Henk Sips \hspace{.125in}

Thinking Machines Corporation  \dotfill  Jim Bailey, Richard Shapiro,
Guy Steele  \hspace{.125in}

United Technologies Corporation  \dotfill  Richard Shapiro
\hspace{.125in}

University of Stuttgart  \dotfill 
 Uwe Geuder, Bernhard Woerner, Roland Zink \hspace{.125in}

University of Southampton \dotfill John Merlin \hspace{.125in}

University of Vienna  \dotfill  Barbara Chapman, Hans Zima
\hspace{.125in}

Yale University  \dotfill  Marina Chen, Aloke Majumdar \hspace{.125in}

Many people contributed sections to the final 
language specification and HPF Journal of Development, including 
Alok Choudhary,
Geoffrey Fox,
Tom Haupt,
Maureen Hoffert,
Ken Kennedy,
Robert Knighten,
Charles Koelbel,
David Loveman,
Piyush Mehrotra,
John Merlin,
Tin-Fook Ngai,
Rex Page,
Sanjay Ranka,
Robert Schreiber,
Richard Shapiro,
Marc Snir,
Matt Snyder,
Guy Steele,
Richard Swift,
Min-You Wu,
and
Mary Zosel.
Many others contributed shorter passages and examples and corrected errors. 

Because public input was encouraged on electronic mailing lists, it is
impossible to identify all who contributed to
discussions;  the entire mailing list was over 500 names
long. Following are  some of the active participants in
the HPFF process not mentioned above:

\begin{centering}

\begin{tabular}{lll}
N. Arunasalam &
Werner Assmann &
Marc Baber \\
Babak Bagheri &
Vasanth Bala &
Jason Behm \\
Peter Belmont &
Mike Bernhardt &
Keith Bierman \\
Christian Bishof &
John Bolstad &
William Camp \\
Duane Carbon & 
Richard Carpenter &
Brice Cassenti \\
Doreen Cheng &
Mark Christon &
Fabien Coelho \\
Robert Corbett &
Bill Crutchfield &
J. C. Diaz \\
James Demmel &
Alan Egolf &
Bo Einarsson \\
Pablo Elustondo &
Robert Ferrell &
Rhys Francis \\
Hans-Hermann Frese &
Steve Goldhaber &
Brent Gorda \\
Rick Gorton &
Robert Halstead &
Reinhard von Hanxleden \\
Hiroki Honda &
Carol Hoover &
Steven Huss-Lederman \\
Ken Jacobsen &
Elaine Jacobson \\
Alan Karp &
Ronan Keryell &
Anthony Kimball \\
Ross Knippe &
Bruce Knobe &
David Kotz \\
Ed Krall &
Tom Lake &
Peter Lawrence \\
Bryan Lawver &
Bruce Leasure &
Stewart Levin \\
David Levine &
Theodore Lewis &
Woody Lichtenstein \\
Ruth Lovely &
Doug MacDonald &
Raymond Man \\
Stephen Mark &
Philippe Marquet &
Jeanne Martin \\
Oliver McBryan &
Charlie McDowell &
Michael Metcalf \\ 
Charles Mosher &
Len Moss &
Lenore Mullin \\
Yoichi Muraoka &
Bernie Murray &
Vicki Newton \\
Dale Nielsen &
Kayutov Nikolay &
Steve O'Neale \\
Jeff Painter &
Cherri Pancake &
Harvey Richardson \\
Bob Riley &
Kevin Robert &
Ron Schmucker \\
J.L. Schonfelder &
Doug Scofield &
David Serafini \\
G.M. Sigut &
Anthony Skjellum &
Niraj Srivastava \\
Paul St.Pierre &
Nick Stanford &
Mia Stephens \\
Jaspal Subhlok &
Xiaobai Sun &
Hanna Szoke \\
Bernard Tourancheau &
Anna Tsao &
Alex Vasilevsky \\
Stephen Vavasis &
Arthur Veen &
Brian Wake \\
Ji Wang &
Karen Warren &
D.C.B. Watson \\
Matthijs van Waveren &
Robert Weaver &
Fred Webb \\
Stephen Whitley &
Michael Wolfe &
Fujio Yamamoto \\
Marco Zagha 

\end{tabular}

\end{centering}

The following organizations made the language draft available by anonymous FTP 
access and/or mail servers:
AT\&T Bell Laboratories,
Cornell Theory Center,
GMD-I1.T (Sankt Augustin), 
Oak Ridge National Laboratory,
Rice University,
Syracuse University,
and Thinking Machines Corporation.
These outlets were instrumental in distributing the document.

The High Performance Fortran Forum also received a great deal of
volunteer effort in nontechnical areas.  Theresa Chatman and Ann
Redelfs were responsible for most of the meeting planning and
organization, including the first HPFF meeting, which drew over 125
people.  Shaun Bonton, Rachele Harless, Rhonda Perales, Seryu Patel, and 
Daniel Swint
helped with many logistical details.  Danny Powell spent a great deal
of time handling the financial details of the project.  Without these
people, it is unlikely that HPF would have been completed.

HPFF operated on a very tight budget (in reality, it had no budget when
the first meeting was announced).  The first meeting in Houston was
entirely financed from the conferences budget of the Center for
Research on Parallel Computation, an NSF Science and Technology
Center.  DARPA and NSF have supported research at various institutions
that have made a significant contribution towards the development of
High Performance Fortran.  Their sponsored projects at Rice, Syracuse,
and Yale Universities were particularly influential in the HPFF
process.  Support for several European participants was provided by
ESPRIT  through projects P6643 (PPPE) and P6516 (PREPARE).  
%{\it Mention anybody else who gave us money here.}


\section{HPFF94 Acknowledgements}
\label{hpff94-ack}

The HPF 1.1 version of the document was prepared during the HPFF94
series of meetings. A number of people shared technical  responsibilities for
the activities of the HPFF94 meetings:

\begin{itemize}

\item Ken Kennedy, Convener and Meeting Chair;

\item Mary Zosel, Executive Director and Head of CCI Group 2;

\item Richard Shapiro, Head of CCI Group 1;

\item Ian Foster, Head of Tasking Subgroup;

\item Alok Choudhary, Head of Parallel I/O Subgroup;

\item Chuck Koelbel, Head of Irregular Distributions Subgroup;

\item Rob Schreiber, Head of Implementation Subgroup;

\item Joel Saltz, Head of Benchmarks Subgroup;

\item David Loveman, Editor, assisted by Chuck Koelbel,
Rob Schreiber, Guy Steele,
and Mary Zosel, section editors.

\end{itemize}

Attendance at the HPFF94 meetings included the following people from
organizations that were represented two or more times.


%Attendee \hfill Organization
Don Heller  \dotfill  Ames Laboratory \hspace{.125in}

Jerrold Wagener  \dotfill  Amoco Production Company
\hspace{.125in}

John Levesque   \dotfill Applied Parallel Research \hspace{.125in}

Ian Foster   \dotfill Argonne National Laboratory \hspace{.125in}

Terry Pratt \dotfill  CESDIS/NASA Goddard \hspace{.125in}

Jim Cowie \dotfill  Cooperating Systems \hspace{.125in}

Andy Meltzer, Jon Steidel \dotfill Cray Research, Inc. 
\hspace{.125in}

David Loveman \dotfill  Digital Equipment Corporation 
\hspace{.125in}

Bruce Olsen \dotfill Hewlett Packard \hspace{.125in}

E. Nunohiro, Satoshi Itoh \dotfill  Hitachi \hspace{.125in}

Henry Zongaro  \dotfill  IBM \hspace{.125in}

Piyush Mehrotra \dotfill 
Institute for Computer Applications in Science
\& Engineering \hspace{.125in}

Bob Knighten, Roy Touzeau \dotfill  Intel SSD  
\hspace{.125in}

Mary Zosel, Bor Chan, Karen Warren \dotfill  
Lawrence Livermore National Laboratory \hspace{.125in}

Ralph Brickner \dotfill Los Alamos National Laboratory \hspace{.125in}

J. Ramanujam \dotfill Louisiana State University \hspace{.125in}

Paula Vaughan, Donna Reese \dotfill 
Miss. State University / NSF ERC \hspace{.125in}

Shoichi Sakon, Yoshiki Seo\dotfill  NEC \hspace{.125in}

P. Sadayappan, Chua-Huang Huang \dotfill  Ohio State University
\hspace{.125in}

Andrew Johnson \dotfill  OSF Research Institute \hspace{.125in}

Chip Rodden, Jeff Vanderlip \dotfill Pacific Sierra Research
\hspace{.125in}

Larry Meadows, Doug Miles \dotfill  The Portland Group, Inc.
\hspace{.125in}

Robert Schreiber \dotfill  
Research Institute for Advanced Computer Science \hspace{.125in}

Ken Kennedy, Charles Koelbel \dotfill  Rice University \hspace{.125in}

Ira Baxter \dotfill  Schlumberger \hspace{.125in}

Alok Choudhary
 \dotfill  Syracuse University \hspace{.125in}

Guy Steele  \dotfill  Thinking Machines Corporation, Sun Microsystems
\hspace{.125in}

Richard Shapiro \dotfill 
 Thinking Machines Corp., Silicon Graphics Inc. \hspace{.125in}

Scott Baden, Val Donaldson\dotfill 
University of California, San Diego \hspace{.125in}

Robert Babb  \dotfill   University of Denver \hspace{.125in}

Joel Saltz, Paul Havlak \dotfill University of Maryland \hspace{.125in}

Nicole Nemer-Preece \dotfill  University of
Missouri-Rolla \hspace{.125in}

Hans Zima, Siegfried Benkner, Thomas Fahringer \dotfill University of
Vienna \hspace{.125in}

An important activity of HPFF94 was the processing of the many items
submitted for comment and interpretation which led to the HPF 1.1
update of the language document.  A special acknowledgement
goes to Henry Zongaro, IBM, for many thoughtful questions 
exposing dark corners of language design that 
were previously overlooked, and to
Guy Steele, Thinking Machines/Sun Microsystems, for his analysis
of, and solutions for, some of the thornier issues discussed.
And general thanks to the people who submitted comments and
interpretation requests, including:

David Loveman, Michael Hennecke, James Cownie,
Adam Marshall, Stephen Ehrlich, Mary Zosel, Matt Snyder,
Larry Meadows, Dick Hendrickson,
Dave Watson, John Merlin, Vasanth Bala, Paul Wesson,
Denis Hugli, Stanly Steinberg, Henk Sips, Henry Zongaro,
Eiji Nunohiro, Jens Bloch Helmers, Rob Schreiber, 
David B. Serafini, and Allan Knies.

Other special mention goes to  Chuck Koelbel at
Rice University for continued 
maintenance of the HPFF mailing lists, to Donna Reese and staff at
Mississippi State University for establishing and maintaining a
WWW home-page for HPFF,
and to the University of Maryland for
establishing a benchmark FTP site.

Theresa Chatman and staff at Rice University were responsible for
meeting planning and organization and Danny Powell 
continued to handle financial details of the project.

HPFF94 received direct support for research and administrative
activities from grants from ARPA, DOE, and NSF.

\section{HPFF 2 Acknowledgements}
\label{hpff2-ack}

The HPF 2.0 version of the document was prepared during a
series of meetings in 1995--1996. 
A number of people shared technical  responsibilities for
the activities of these  meetings:

\begin{itemize}

\item Ken Kennedy, Convener and Meeting Chair;

\item Mary Zosel, Executive Director;

\item Rob Schreiber, Organizer for Control Subgroup;

\item Piyush Mehrotra and Guy Steele, Organizers for Distribution Subgroup;

\item David Loveman, Organizer for External Subgroup;

\item  Chuck Koelbel, Editor, assisted by a horde of screaming
extras whose names will be listed here after we determine who does
the work.

\end{itemize}

Attendance at the HPFF 2 meetings included the following people from
organizations that were represented two or more times. NOTE THIS LIST
IS IN DRAFT FORM - IT IS NEITHER COMPLETE NOR ORDERED PROPERLY.


%Attendee \hfill Organization

John Levesque   \dotfill Applied Parallel Research \hspace{.125in}

Ian Foster   \dotfill Argonne National Laboratory \hspace{.125in}

Joel Williamson \dotfill Convex/Hewlett Packard \hspace{.125in}

Jim Cowie \dotfill  Cooperating Systems \hspace{.125in}

Andy Meltzer \dotfill Cray Research, Inc.
\hspace{.125in}

David Loveman, Carl Offner \dotfill  Digital Equipment Corporation
\hspace{.125in}

E. Nunohiro, Satoshi Itoh \dotfill  Hitachi \hspace{.125in}

Henry Zongaro  \dotfill  IBM \hspace{.125in}

Piyush Mehrotra \dotfill
Institute for Computer Applications in Science
\& Engineering \hspace{.125in}

Mary Zosel  \dotfill
Lawrence Livermore National Laboratory \hspace{.125in}

Robert Boland \dotfill Los Alamos National Laboratory \hspace{.125in}

J. Ramanujam \dotfill Louisiana State University \hspace{.125in}

Yoshiki Seo\dotfill  NEC \hspace{.125in}

P. Sadayappan  \dotfill  Ohio State University
\hspace{.125in}

Larry Meadows \dotfill  The Portland Group, Inc.
\hspace{.125in}

Robert Schreiber \dotfill
Research Institute for Advanced Computer Science / HP \hspace{.125in}

Ken Kennedy, Charles Koelbel \dotfill  Rice University \hspace{.125in}

Alok Choudhary, Hon Y
\dotfill  Syracuse University \hspace{.125in}

Guy Steele  \dotfill  Sun Microsystems
\hspace{.125in}

Carol Monroe \dotfill
Thinking Machines Corp. \hspace{.125in}

Scott Baden, Steve Fink, Jay Boisseau \dotfill
University of California, San Diego \hspace{.125in}

Robert Babb  \dotfill   University of Denver \hspace{.125in}

Joel Saltz, Paul Havlak \dotfill University of Maryland \hspace{.125in}

Jerrold Wagener  \dotfill X3J3, University of Oklahoma
\hspace{.125in}

Barbara Chapman, Siegfried Benkner, Guy Robinson \dotfill University of
Vienna \hspace{.125in}

An important activity of the HPFF 2 meetings was the processing of the many items
submitted for comment and interpretation. 
A special acknowledgement
goes to Henry Zongaro, IBM, (again) for many thoughtful questions
exposing dark corners of language design that
were previously overlooked. 
And general thanks to the people who submitted comments and
interpretation requests, including:

THIS IS AN OLD LIST - UPDATE NEEDED FOR NEW DOCUMENT
David Loveman, Michael Hennecke, 
Adam Marshall, Stephen Ehrlich, 
Larry Meadows, 
Henry Zongaro,
Eiji Nunohiro, Jens Bloch Helmers, Rob Schreiber,

Other special mention goes to  Chuck Koelbel at
Rice University for continued
maintenance of the HPFF mailing lists.

Theresa Chatman and staff at Rice University were responsible for
meeting planning and organization and Danny Powell
continued to handle financial details of the project.

HPFF  2 received direct support for research and administrative
activities from grants from ARPA, DOE, and NSF.


From owner-hpff-doc  Tue May 28 15:01:47 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id PAA01437 for hpff-doc-out; Tue, 28 May 1996 15:01:47 -0500 (CDT)
Received: from pacific.pgroup.com (pacific.pgroup.com [192.124.124.8]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id PAA01428 for <hpff-doc@cs.rice.edu>; Tue, 28 May 1996 15:01:40 -0500 (CDT)
Received: from stealth.pgroup.com (stealth [192.124.124.22]) by pacific.pgroup.com (8.7.5/8.6.11) with ESMTP id NAA18983 for <hpff-doc@cs.rice.edu>; Tue, 28 May 1996 13:01:37 -0700 (PDT)
From: Larry Meadows <lfm@pgroup.com>
Received: (from lfm@localhost) by stealth.pgroup.com (8.7.5/8.7.2) id NAA04089 for hpff-doc@cs.rice.edu; Tue, 28 May 1996 13:01:37 -0700 (PDT)
Date: Tue, 28 May 1996 13:01:37 -0700 (PDT)
Message-Id: <199605282001.NAA04089@stealth.pgroup.com>
To: hpff-doc@cs.rice.edu
Subject: hpff-doc: io-ext.tex
Sender: owner-hpff-doc
Precedence: bulk


% File: io-ext.tex

% Contents:
% Approved Extension for Asynchronous I/O for HPF 2.0 document

% Revision history:
% May-28-96	Created by Larry Meadows, The Portland Group


\chapter{Approved Extension for Asynchronous I/O}
\label{ch-io-ext}

{\em
Comments on this chapter should be directed to 
{\tt hpff-doc@cs.rice.edu}.
Please use ``{\tt Comments on Asynch I/O}'' as the {\tt Subject:}
line.
\par
}


This chapter defines a mechanism for performing asynchronous I/O
from an HPF or Fortran program. The mechanism is presented as a set of
changes to the Fortran 90 standard.

To section 9.4.1, Rule R912, add

                                                                        \BNF
io-control-spec            \IS ID = scalar-default-int-variable
                                                                        \FNB

To the constraints on Rule R912 in section 9.4.1, add

\begin{constraints}

\item If an {\tt ID=} specifier appears, then no function reference may appear
in an expression anywhere in the data transfer statement.

\end{constraints}

To section 9.7, add the following text:

If an {\tt ID=} specifier appears, then no function reference may appear
in an expression anywhere in the data transfer statement.

At the end of section 9.4.1, add the following paragraph:

The addition of the {\tt ID=} specifier results in the initiation of an
asynchronous data transfer.  The data transfer statement must
eventually be followed by a {\tt WAIT} statement specifying the same
{\tt ID} value that was returned to the {\tt ID} variable in the data transfer
statement.  This {\tt WAIT} statement is called the
matching {\tt WAIT} statement. Note that asynchronous data transfer must
be direct and unformatted.

Insert the following text at the end of section 9.4.3 before the
final paragraph:

For an asynchronous data transfer, errors or end-of-file conditions
may occur either during execution of the data transfer statement or
during the subsequent data transfer.  If these conditions do not
result in termination of the program, they can be detected
by the programmer via the {\tt ERR=} and {\tt IOSTAT=}
specifiers in the matching {\tt WAIT} statement.

Alter the paragraph at the end of 9.4.3 to read as follows:

Execution of the executable program is terminated if an error condition
occurs during execution of, or during subsequent data transfer for,
an input/output statement that contains neither
an {\tt IOSTAT=} nor an {\tt ERR=} specifier.

Should an asynchronous data transfer statement cause the {\tt ERR} or
{\tt IOSTAT}
variables to be set then any matching
{\tt WAIT} statement will do the same.

To section 9.4.4, alter operation 8 to read as follows:

\begin{itemize}

\item  (8) Cause any variables specified in the {\tt IOSTAT=},
{\tt SIZE=}, and
{\tt ID=} specifiers to become defined.

\end{itemize}

To section 9.4.4, add the following three paragraphs:

For asynchronous data transfers, steps 1--8 correspond to both
the asynchronous data transfer statement and the matching {\tt WAIT}
statement.  Steps 4--7 may occur asynchronously with program
execution.  If an implementation does not support asynchronous
data transfers, then steps 1--8 may be performed by the asynchronous
data transfer statement; the matching {\tt WAIT} statement must still
be executed, its only effect being to return status information.

In the portion of the program which executes between the asynchronous data
transfer input/output statement and the matching {\tt WAIT},
no entity appearing
in an expression anywhere in the input/output list may be assigned to
or accessed.

Multiple outstanding asynchronous data transfer operations are allowed,
but they must be either all reads or all
writes. No other I/O statements on the same unit are allowed until the
matching {\tt WAIT} statements for all outstanding transfers have been executed.
If two {\tt WRITE} statements that specify the same
record number are executed, then the program is non-conforming.
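
As an illustration only, a program using the proposed specifiers might
look like the following sketch.  The unit number, file name, record
length, record numbers, and all variable names are assumptions made for
this example and are not part of the proposal; the code is not expected
to compile with a processor that lacks this extension.

\begin{verbatim}
PROGRAM ASYNC_READ_SKETCH
  REAL    :: A(1000), B(1000)
  INTEGER :: HANDLE, IOS

  ! Asynchronous transfers must be direct and unformatted.
  ! RECL units are processor dependent; 4000 is an assumption.
  OPEN (UNIT=10, FILE='data.bin', ACCESS='DIRECT', &
        FORM='UNFORMATTED', RECL=4000)

  B = 1.0

  ! Initiate the transfer; HANDLE becomes defined.
  READ (UNIT=10, REC=7, ID=HANDLE, IOSTAT=IOS) A

  ! Between the READ and its matching WAIT, A may be neither
  ! referenced nor defined.  Work on B is unaffected.
  B = 2.0 * B

  ! Matching WAIT: afterwards A and IOS are as if the READ had
  ! been executed synchronously at this point.
  WAIT (ID=HANDLE, IOSTAT=IOS)
  IF (IOS /= 0) STOP 'asynchronous READ failed'
END PROGRAM ASYNC_READ_SKETCH
\end{verbatim}

The {\tt IOSTAT=} specifier appears on both statements because the
matching {\tt WAIT} may carry it only if the data transfer statement
did.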

Advice to users:

Note: left-to-right definition of the I/O list on a {\tt READ} is still permitted.
Thus {\tt READ(10,ID=K,REC=10) I,A(I)}, where {\tt K} is a default integer
variable, is legal and has the same semantics as a
synchronous {\tt READ}.

End advice to users.

To section 9, change ``and {\tt INQUIRE} statements'' to
``, {\tt INQUIRE}, and {\tt WAIT} statements''.

Add a new section 9.4.7:

9.4.7 WAIT statement
                                                                        \BNF

        wait-stmt \IS WAIT (wait-spec-list)
        wait-spec \IS ID = scalar-default-int-expr
                  \OR ERR = label
                  \OR IOSTAT = scalar-default-int-variable
                  \OR DONE=scalar-default-logical-variable
                                                                        \FNB


\begin{constraints}

\item The {\tt ERR=} and {\tt IOSTAT=} specifiers may only be present
if they were present in the matching asynchronous
data transfer statement.

\end{constraints}

The {\tt WAIT} statement terminates an asynchronous data transfer.
The {\tt ID=} specifier must be present.
The {\tt IOSTAT=} and {\tt ERR=} specifiers are optional and are described in
sections 9.4.1.4 and 9.4.1.5, respectively.

The {\tt DONE=} specifier is optional. If present, then the
scalar-default-logical-variable is set to {\tt .TRUE.} if the asynchronous
operation is complete, and to {\tt .FALSE.} if it is not complete.

The {\tt WAIT} statement causes the processor
to wait until the matching data transfer statement terminates,
either normally, or with an error condition. After execution
of the matching {\tt WAIT} statement, error conditions, control transfer,
data transfer, and the value of any {\tt IOSTAT=} variable are as if the
data transfer statement had been executed synchronously in place of the
matching {\tt WAIT} statement, and with the addition of any {\tt IOSTAT=}
and {\tt ERR=}
specifiers to the {\it io-control-spec-list} of the data transfer
statement.

If the {\tt DONE=} specifier is present and the returned value is
{\tt .FALSE.}, then one
or more further matching {\tt WAIT} statements must be executed until either the
{\tt DONE=} specifier is not present or {\tt .TRUE.} is returned.
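
A minimal sketch of polling with {\tt DONE=} follows.  It assumes that
a {\tt WAIT} statement carrying {\tt DONE=} returns without blocking
when the transfer is still in progress, which the rule above suggests
but does not state outright.  The unit number, file name, record
length, and variable names are likewise assumptions made for
illustration.

\begin{verbatim}
PROGRAM ASYNC_POLL_SKETCH
  REAL    :: A(1000)
  INTEGER :: HANDLE, NPOLLS
  LOGICAL :: FINISHED

  ! RECL units are processor dependent; 4000 is an assumption.
  OPEN (UNIT=10, FILE='data.bin', ACCESS='DIRECT', &
        FORM='UNFORMATTED', RECL=4000)

  A = 0.0

  ! Initiate an asynchronous write of record 3.
  WRITE (UNIT=10, REC=3, ID=HANDLE) A

  NPOLLS   = 0
  FINISHED = .FALSE.
  DO WHILE (.NOT. FINISHED)
     ! Stands in for useful work that does not touch A.
     NPOLLS = NPOLLS + 1
     ! Matching WAIT; repeated until DONE= returns .TRUE.
     WAIT (ID=HANDLE, DONE=FINISHED)
  END DO
END PROGRAM ASYNC_POLL_SKETCH
\end{verbatim}

Neither {\tt IOSTAT=} nor {\tt ERR=} appears on the {\tt WAIT}
statement here because neither appears on the {\tt WRITE} statement.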


Advice to implementors:

Implementors may choose to implement any or all asynchronous I/O
synchronously. This essentially means using the ID= clause on the
data transfer statement to store the results of the transfer, then
supplying the results to the matching WAIT statement.

End advice to implementors.

From owner-hpff-doc  Wed May 29 15:39:44 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id PAA05363 for hpff-doc-out; Wed, 29 May 1996 15:39:44 -0500 (CDT)
Date: Wed, 29 May 1996 15:39:44 -0500 (CDT)
Message-Id: <199605292039.PAA05363@cs.rice.edu>
From: Mary E Zosel <zosel@llnl.gov>
Subject: hpff-doc: extrinsic-app.tex
Sender: owner-hpff-doc
Precedence: bulk



% File: extrinsic-app.tex

% Contents:
% Extrinsic Interfaces Appendix for HPF 2.0 document, including
%       policy
%       mechanism
%       HPF_CRAFT

% Revision history:
% May-10-96     Created by Charles Koelbel, Rice University
%               (from HPF 2.0 proposals)
% May 29-96     Mary Zosel, LLNL
%               fill in the extrinsic policy and procedures section


\chapter{Extrinsic Interfaces}
\label{ch-extrinsic-app}

{\em
Comments on this chapter should be directed to 
Mary Zosel ({\tt zosel@llnl.gov}),
Andy Meltzer ({\tt meltzer@cray.com}),
and {\tt hpff-doc@cs.rice.edu}.
Please use ``{\tt Comments on Extrinsic Interfaces}'' as the {\tt Subject:}
line.
\par
}


\section{Extrinsic Interface Policy}

HPF defines certain extrinsics, such as {\tt HPF_LOCAL} and 
{\tt HPF_SERIAL}, as interfaces that HPFF believes are useful to the HPF 
community. But there are many more such extrinsic interfaces beyond those 
maintained by HPFF. HPFF has adopted a policy of formally 
recognizing certain extrinsic interface definitions, where the interface and its 
addition to the HPF document are considered to be a service to the HPF 
community. Examples are language bindings to HPF or library packages. 

To be considered for HPFF recognition, a proposed extrinsic must 
demonstrate the following things.  
It should be noted, however, that meeting these criteria 
does not guarantee acceptance of a proposed interface by HPFF.

\begin{itemize}
\item    conformance to HPF rules for calling extrinsics  (reference)
\item    significant new functionality
\item    existing practice such as users, implementations, etc.
\item    institutional backing with evidence of ongoing support
\item    coherent documentation
\item    non-proprietary interface definition
\item    copyright goes to HPFF for interface, with permission to 
                 use (royalty free)
\end{itemize}

If a proposed extrinsic is accepted by HPFF, then:
\begin{itemize}

\item    HPFF will recognize the interface and reference it in documentation, but 
HPFF does not assume responsibility for the extrinsic or its interface.

\item    The sponsor of the extrinsic must continue to conform to the HPF 
interface rules for extrinsics.  The interface HPFF approves must not change 
without HPFF approval.

\item    The sponsor must assume responsibility for any CCI requests 
concerning the extrinsic.

\end{itemize}

A list of recognized extrinsic interfaces will be included in HPF 
documentation, with the following guidelines:
\begin{itemize}

\item    There should be a single page introduction to the extrinsic which 
contains:

\begin{itemize}

\item        the name of the extrinsic
\item        a brief abstract of functionality
\item        a brief and informal description of the interface
\item        information about platform and system availability
\item        reference and contacts for formal documentation, continued
                responsibility, and additional information (e.g. compiler 
                availability).

\end{itemize}

\item    There should be about two pages with short examples of usage.

\item     There should be a short paper with the formal definition of the interface and an 
informal description of the functionality of the extrinsic.

\end{itemize}

\section{Extrinsic Interface Mechanism}

To submit an extrinsic interface to HPFF (- SAY HOW -)  for consideration, 
the sponsor prepares a proposal that includes:

\begin{itemize}

\item    a statement of what significant new functionality is provided
\item    a description of existing practice
\item    a statement of institutional backing with evidence of ongoing support
\item    a copy of the complete documentation or a reference to an
            online version of the documentation
\item    a draft of the text (described above) that would be included in 
                 the HPFF  documentation
\item    a statement justifying the claim that the interface follows HPF 
                  conventions for calling extrinsics.
\end{itemize}

If the proposed extrinsic interface is approved 
by HPFF, the sponsor then submits:

\begin{itemize}
\item    a formal statement for HPFF records that the interface definition is
                non-proprietary and that  the copyright of the
                interface belongs to HPFF.
\item    the formal contact for CCI and continued maintenance of the interface.
\item    a copy of the interface documentation formatted for HPFF use, including
                 a copy in the current document and web mark-up languages.
\end{itemize}

\section{HPF_CRAFT proposed interface}
%-----------------------------------------------------------------------------
% SPMD_HPF proposal, apparent date Jan 3, 1996
%-----------------------------------------------------------------------------



HPF_CRAFT is a hybrid language, combining an SPMD execution model with
HPF 2.0 features.  The model combines the multi-threaded execution of
HPF_LOCAL and the HPF 2.0 syntax and features.  The goal of HPF_CRAFT is
to attain the potential performance of an SPMD programming model with
access to HPF features and a well-defined extrinsic interface to HPF.

Informal description of interface goes here ...

Statement about platform and system availability goes here

Reference for full documentation, formal contact, and how to get
HPF_CRAFT goes here.

Next some short examples  (~2 pages)





%----------------------------------------------------------------------------
% HPF_SPMD proposal, apparent date Jan 29, 1996
% I'm unclear on the relation of this one with the previous section (chk)
%----------------------------------------------------------------------------


\documentstyle[twoside,11pt]{article}
%\input{psfig}
\pagestyle{myheadings}
%set dimensions of columns, gap between columns, and paragraph indent 
\setlength{\textheight}{8.75in}
\setlength{\textwidth}{5.5in}
\setlength{\footheight}{0.0in}
\setlength{\topmargin}{0.0in}
\setlength{\leftmargin}{0.0in}
\setlength{\headheight}{0.0in}
\setlength{\headsep}{0.25in}
\setlength{\oddsidemargin}{0.0in}
\setlength{\parindent}{1pc}
\setlength{\itemsep}{0.0in}
\setlength{\topsep}{0.0in}
\setlength{\parsep}{0.0in}
\setlength{\parskip}{.10in}

\markboth{Cray Research Inc.}{Cray Research Inc.}

% copied stuff out of art10.sty and modified them to conform to IEEE format.
\makeatletter
% as Latex considers descenders in its calculation of interline spacing,
%to get 12 point spacing for normalsize text, must set it to 10 points 
\def\@normalsize{\@setsize\normalsize{11pt}\xpt\@xpt
\abovedisplayskip 11pt plus2pt minus5pt\belowdisplayskip 
\abovedisplayskip \abovedisplayshortskip \z@ 
plus3pt\belowdisplayshortskip 6pt plus3pt 
minus3pt\let\@listi\@listI}

% need a 12 pt font size for subsection and abstract headings 
\def\subsize{\@setsize\subsize{12pt}\xipt\@xipt}

%make section titles bold and 12 point, 2 blank lines before, 1 after 
\def\section{\@startsection {section}{1}{\z@}{1.0ex plus 
1ex minus .2ex}{.2ex plus .2ex}{\large\bf}}

%make subsection titles bold and 11 point, 1 blank line before, 1 after 
\def\subsection{\@startsection {subsection}{2}{\z@}{.2ex 
plus 1ex} {.2ex plus .2ex}{\subsize\bf}}

\makeatother
\begin{document}
% don't want date printed
\date{}
%make title bold and 14 pt font (Latex default is non- bold, 16pt) 

\title{\Large\bf HPF\_SPMD}

\author{
   Andy Meltzer \\
   Cray Research, Inc. \\
   655F Lone Oak Drive \\
   Eagan, MN 55121
}
\maketitle

% I don't know why I have to reset thispagestyle, but 
% otherwise get page numbers 
\thispagestyle{empty}

\subsection*{\centering Abstract}
% IEEE allows italicized abstract
{\em
HPF\_SPMD is a hybrid language, combining an SPMD execution model with 
HPF Kernel features.
The model combines the multi-threaded execution of HPF\_LOCAL and the
HPF Kernel syntax and features.  The goal of HPF\_SPMD is to 
attain the potential performance of an SPMD programming model with 
access to HPF features and a well-defined extrinsic interface to HPF.
}

\section{Introduction\label{sec:intro}}

HPF\_SPMD is a hybrid language, combining an SPMD execution 
model with HPF Kernel features.
The model combines the multi-threaded execution of HPF\_LOCAL and the
HPF Kernel syntax and features.  The goal of HPF\_SPMD is to 
attain the potential performance of an SPMD programming model with 
access to HPF features and a well-defined extrinsic interface to HPF.
It is built on top of the HPF\_LOCAL extrinsic environment.
This language is based on the current definition of the HPF Kernel and 
should change as the HPF Kernel changes.  

SPMD features and a multi-threaded model allow the user to take
advantage of the performance and opportunity for low level access of 
a more general purpose programming model.  Including the HPF Kernel data
distribution features gives the programmer access to the highest 
performing aspects of both models with the penalty of a somewhat 
more complex model.
HPF\_SPMD is not appropriate for all platforms, but is consistent with HPF
and easily targeted for platforms that have HPF and can support 
SPMD programming styles.

The syntax of HPF\_SPMD is a superset of the syntax of the HPF Kernel and the
extrinsic language's semantics are very similar to those of HPF. There are 
some differences, however. For example, I/O behaves differently: in
HPF\_SPMD different processors are allowed to read from different files at
the same time, whereas in HPF the processors must all read from the same
file.   The differences in the
models are principally caused by the multi-threaded execution model and
the introduction of HPF\_LOCAL data rules.

HPF\_SPMD allows for the notion of {\em private data}.  Data defaults
to a mapping in which data items are allocated so that each processor
has a unique copy.
The values of the individual
data items and the flow of control may vary from processor to
processor within HPF\_SPMD. This behavior is consistent with the behavior
of HPF\_LOCAL.  In HPF\_SPMD a processor
may be individually named and code executed based upon which processor
it is executing on.


\section{Execution Model\label{sec:exec-model}}

HPF\_SPMD is built upon the fundamental execution model of HPF\_LOCAL,
augmented with data mapping and work distribution features from the HPF Kernel.
It is also augmented with many explicit low-level control features,
some taken from Cray Research's CRAFT language.

In HPF\_SPMD there is a single task on each processor and 
all tasks begin executing in parallel, with data defaulting
to a private distribution, the same default distribution used in HPF\_LOCAL.
Each processor gets a copy of the data storage unless specified otherwise by 
the user.
Consequently I/O works identically to I/O in HPF\_LOCAL and 
message passing libraries are easily integrated.

In short, the execution model is that of HPF\_LOCAL.

To provide correct behavior when explicitly mapped data is involved, 
this model defines barrier points at which
conceptually all processors must stop and wait for the execution of all
other processors before they continue.  These barriers are an additional
constraint when compared to an HPF\_LOCAL program, but are only a small
subset of the implicit barriers in the comparable HPF 2.0 program.
An implementation may remove 
many of these barriers where they are deemed unnecessary, but EVERY 
processor must participate in the barriers at each one of these points.

In the following situations, the compiler automatically inserts implicit
barriers into a program:

\begin{itemize}
\item       At the end of many independent loops to ensure correctness
            when one processor may get ahead of others.
\item       When mapped stack variables are allocated; this includes
            when a subroutine is called that remaps data.
\item       At a {\tt SERIAL} or {\tt END SERIAL} directive.
\item       When the {\tt SYMMETRIC} directive is used (this directive
            is described in the ``Other Useful Features'' section).
\item       At some array syntax statements to ensure correctness.
\end{itemize}

{\em Question: Should an HPF\_SPMD subprogram have access to global HPF data?}

\section{Data Mapping Features}

Data mapping feature syntax is identical to that
in the HPF Kernel.  The semantics of the data mapping directives are
also identical.  

The only difference (as mentioned above) is that
the default distribution is private so that values of the 
same data item on different 
processors may vary.  This is consistent with the HPF\_LOCAL 
interpretation of the data declaration.

When data is explicitly mapped, only one copy of the data storage
is created unless the explicit mapping directs otherwise.  The
value of explicitly mapped replicated data items must be consistent
between processors as is the case in the HPF Kernel.  

A new directive is suggested for completeness: {\tt PE\_PRIVATE}, which
specifies that the data should conform to the default behavior.
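
As an illustrative sketch only, the declarations below contrast a private
array, an explicitly mapped array, and an array marked with the suggested
directive.  The array names and sizes are invented for this example, and the
form shown for {\tt PE\_PRIVATE} is an assumption, since this proposal does
not define its syntax.

\begin{verbatim}
      REAL X(100), A(100), Y(100)
!     X has no mapping directive, so it is private:
!     every processor holds its own copy of all 100 elements.
!     A is explicitly mapped: one copy, distributed by blocks.
!HPF$ DISTRIBUTE A(BLOCK)
!     Assumed syntax: Y is declared private explicitly,
!     which gives it the same behavior as X.
!HPF$ PE_PRIVATE Y
\end{verbatim}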


\section{Subprogram Interfaces}

\subsection{Calling an HPF\_SPMD Subprogram from HPF\_SPMD}

The behavior and requirements of an HPF\_SPMD program at subprogram 
interfaces are identical to those of the HPF Kernel for dummy arguments that
are explicitly mapped.  

All processors must co-operate in a subprogram
invocation that remaps or explicitly maps data.  In other words, if
an explicit interface is required (by the HPF Kernel rules) or the 
subprogram declares explicitly mapped data, the subprogram must be called on
all processors.  Processors need not co-operate if there are only 
reads to non-local data.  The {\tt INHERIT} attribute can only be
applied to explicitly mapped data.

The behavior of an HPF\_SPMD subprogram at subprogram interfaces
is identical to that of HPF\_LOCAL for data that has the default private
mapping.  Data is passed individually on every processor and the
processors need not interact in any way.   All HPF arrays are logically
carved up into pieces; the HPF\_SPMD procedure executing on a 
particular physical processor sees an array containing just those
elements of the global array that are mapped to that physical 
processor.

When a subroutine is passed actual arguments that are a combination
of both explicitly mapped data and private data, the explicitly mapped
data follows the HPF rules and the private data follows the HPF\_LOCAL
rules.

The user also has the option of passing data with explicitly
mapped actual arguments to dummy arguments that are not explicitly 
mapped.  The mapping rules for this data are identical to the mapping
rules when HPF calls an HPF\_LOCAL routine.  The data remains ``in place''.

Finally, it is undefined for an actual argument to be private and
the dummy argument to be explicitly mapped.   A definition could 
be supplied for this interaction, but it is the same solution that 
one might propose for a calling sequence when HPF\_LOCAL routines call 
HPF routines.  

\subsection{Calling an HPF\_SPMD Subprogram from HPF 2.0}

The calling convention and argument passing rules for HPF\_SPMD are
a hybrid of those for HPF 2.0 calling HPF\_LOCAL and HPF 2.0 calling
HPF 2.0.  Explicit interfaces are required.  Where dummy arguments
are private (default) storage, the HPF calling HPF\_LOCAL 
conventions are used.  Where dummy arguments are explicitly mapped,
the calling convention matches HPF calling HPF.  

There are a number of constraints on the HPF\_SPMD subprograms
that may be called from HPF.  The following restrictions apply to
HPF\_SPMD subprograms called from HPF:
\begin{itemize}
\item Recursive subprograms cannot be called from HPF.
\item Subprograms containing alternate returns cannot be called from HPF.
\item An HPF\_SPMD routine may not be invoked directly or
      indirectly from within the body of a {\tt FORALL} construct or
      in the body of an {\tt INDEPENDENT DO} loop inside an HPF program.
\item Scalar dummy arguments in a routine called by HPF must be mapped 
      so that each processor has a copy of the argument.
\item The attributes (type, kind, rank, optional, intent) of the dummy
      arguments in a routine called by HPF
      must match the attributes of the corresponding dummy 
      arguments in the explicit interface.
\item A dummy argument of an HPF\_SPMD routine called by HPF may not be
      a procedure name.
\item A dummy argument of an HPF\_SPMD routine called by HPF may not have
      the {\tt POINTER} attribute.
\item A dummy argument of an HPF\_SPMD routine called by HPF must be
      non-sequential.
\item A dummy argument of an HPF\_SPMD routine called by HPF must have
      assumed shape even when it is explicit shape in the interface.
\item When an HPF program calls an HPF\_SPMD routine, scalar dummy
      arguments and scalar function results are, by default,
      replicated on each processor.
\end{itemize}

\subsubsection{Argument Association}

If a dummy argument of an {\tt EXTRINSIC(HPF\_SPMD)} routine is an
array and the dummy argument of the HPF\_SPMD routine has the default 
private mapping, then the corresponding dummy argument in the 
specification of the HPF\_SPMD procedure must be an array of the same 
rank, type, and type parameters.  When the extrinsic procedure is invoked, 
the dummy argument is associated with the local array that consists of the
subgrid of the global array that is stored locally if it has the
private mapping.  

If the dummy argument of the HPF\_SPMD routine
is explicitly mapped, it must have the same mapping as the dummy argument
of the {\tt EXTRINSIC(HPF\_SPMD)} routine.   Note that this restriction
does not require actual and dummy arguments to match and is no more stringent
than saying that mappings of dummy arguments in interface blocks must 
match those in the actual subprogram.


\subsubsection{Calling Sequence}

The actions detailed below must occur prior to the invocation of the 
HPF\_SPMD procedure on each processor.  These actions are the responsibility
of the compiler and happen automatically.

At the call site the following events occur:
\begin{enumerate}
\item The processors are synchronized.
\item Each actual argument is remapped, if necessary, according to the
      directives in the interface block.  Actual arguments corresponding
      to unmapped scalar dummy arguments are replicated.
\end{enumerate}

At the return of the subprogram, the following events occur:
\begin{enumerate}
\item All processors are synchronized.
\item The original distribution of arguments and results is restored if
      necessary.
\end{enumerate}

If the extrinsic procedure is a function, then the HPF\_SPMD procedure is 
also a function.  If the function result is mapped in the caller, the
function result must:
\begin{itemize}
\item also be explicitly mapped with the same mapping in the extrinsic 
      procedure.
\item or return the local part of the extrinsic function return value.
\end{itemize}
If the extrinsic function is scalar-valued then the implicit mapping
of the return value is replicated.  Thus, such a function must
return the same value on every processor.
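
The following is a minimal sketch of these conventions, not a definitive
usage.  The names {\tt CALLER}, {\tt SCALE\_LOCAL}, {\tt X}, and
{\tt FACTOR} are hypothetical, and placing the {\tt EXTRINSIC} prefix on the
subprogram definition as well as in the interface block is an assumption
carried over from the way HPF\_LOCAL routines are written.  The dummy array
{\tt X} is left unmapped in the interface, so it is private and the HPF
calling HPF\_LOCAL conventions apply: each processor sees only the locally
stored part of {\tt A}.  The scalar {\tt FACTOR} is replicated.

\begin{verbatim}
      PROGRAM CALLER
      REAL A(1000)
!HPF$ DISTRIBUTE A(BLOCK)
      INTERFACE
        EXTRINSIC(HPF_SPMD) SUBROUTINE SCALE_LOCAL(X, FACTOR)
          REAL, DIMENSION(:), INTENT(INOUT) :: X
          REAL, INTENT(IN) :: FACTOR
        END SUBROUTINE SCALE_LOCAL
      END INTERFACE
      A = 1.0
!     Processors synchronize, arguments are remapped if needed,
!     and SCALE_LOCAL is then invoked on every processor.
      CALL SCALE_LOCAL(A, 2.0)
      END PROGRAM CALLER

      EXTRINSIC(HPF_SPMD) SUBROUTINE SCALE_LOCAL(X, FACTOR)
      REAL, DIMENSION(:), INTENT(INOUT) :: X
      REAL, INTENT(IN) :: FACTOR
!     X is private here: each processor scales its own section.
      X = FACTOR * X
      END SUBROUTINE SCALE_LOCAL
\end{verbatim}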


\subsection{Calling an HPF\_SPMD Subprogram from HPF\_LOCAL}

This document does not define the interface for calling HPF\_SPMD
subprograms from HPF\_LOCAL, since the definition hinges on the
way HPF\_LOCAL data is combined to form explicitly mapped distributions.
Once a calling convention is defined for HPF\_LOCAL calling HPF, this
should be obvious.

One can, however, define a calling convention when dummy arguments
consist only of private data.  In this case, the private data
is passed as if to another HPF\_LOCAL routine (in other
words, as if private data is passed to private data within HPF\_SPMD) 
and all is well defined.

\subsection{Calling an HPF Subprogram from HPF\_SPMD}

This document does not define this interface.  However, when passing
explicitly mapped data with an explicit interface the rules could be
defined to be as if an HPF\_SPMD program were calling an HPF\_SPMD
subprogram.  Passing private data would either cause undefined behavior
or need further definition.


\section{Executable Statements}
\subsection{The {\tt INDEPENDENT} directive}

The {\tt INDEPENDENT} directive is part of HPF\_SPMD with the same semantics
as in HPF 2.0.  However, within {\tt INDEPENDENT} loops
the values of private data may vary from processor to processor.

{\tt INDEPENDENT} applied to {\tt FORALL} has the same syntax and 
semantics as in HPF.

\subsection{The {\tt NEW} Clause}

An HPF independent loop optionally may have a {\tt NEW} clause. The {\tt NEW}
clause is not required by HPF\_SPMD for default (not explicitly mapped)
data. In HPF\_SPMD data defaults to
private so values may differ from processor to processor.

Private data has slightly different behavior than
data specified in the {\tt NEW} clause.  
The value of a private datum on each processor can be used beyond a single
iteration of the loop. 
Private data may be used to compute local sums, for example.  
The values of data items named in a {\tt NEW} clause
may not be used beyond a single iteration. The {\tt NEW} clause asserts that
the {\tt INDEPENDENT} directive would be valid if new objects were created for
the variables named in the clause for each iteration of the loop.

The semantics of the {\tt NEW} clause are identical in
HPF\_SPMD and HPF 2.0. The variables named in a {\tt NEW} 
clause apply only to the immediately subsequent loop nest.

The meaning of {\tt INDEPENDENT} when applied to loops with private data
changes slightly with respect to the private data.  The change can be
summarized as follows: with respect to the private data, the directive
asserts only that iterations on different processors have no dependencies
upon one another, rather than that no iterations have dependencies upon
one another.
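
The contrast between private data and {\tt NEW} variables can be sketched
as follows.  This is an illustration under stated assumptions rather than a
prescribed usage: the names {\tt A}, {\tt T}, and {\tt LOCAL\_SUM} are
invented, and the sketch assumes that each iteration is executed by exactly
one processor.

\begin{verbatim}
      REAL A(1000), T, LOCAL_SUM
      INTEGER I
!HPF$ DISTRIBUTE A(BLOCK)
      LOCAL_SUM = 0.0
!HPF$ INDEPENDENT, NEW(T)
      DO I = 1, 1000
!        T is named in NEW: its value is not used
!        beyond the iteration that computes it.
         T = A(I) * A(I)
!        LOCAL_SUM is private: on each processor it carries
!        a value from one local iteration to the next.
         LOCAL_SUM = LOCAL_SUM + T
      END DO
\end{verbatim}

After the loop, {\tt LOCAL\_SUM} on each processor holds the partial sum for
the iterations that processor executed; the values generally differ from
processor to processor.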


\subsection{{\tt REDUCE}}

The {\tt REDUCE} directive has the same syntax and semantics as the
HPF Kernel {\tt REDUCE} directive.  Only explicitly mapped data may
be assigned to in a {\tt REDUCE} directive.


\subsection{Array Syntax}

Array syntax is treated identically in HPF\_SPMD as in HPF 2.0 for 
explicitly mapped objects.   
For private objects the behavior is 
identical to that of HPF\_LOCAL.   When private (default) objects and 
explicitly mapped objects are combined the rules are as follows:

Given:
\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\em result} = {\em rhs$_1$} op$_1$ {\em rhs$_2$} op$_2$ ... op$_m$ 
{\em rhs$_n$}\\
\end{tabbing}
\begin{itemize}
\item If {\em result} is explicitly mapped and all {\em rhs} arrays are 
      explicitly mapped, the work is distributed as in HPF.

\item If {\em result} is private and all {\em rhs} arrays are private the
      computation is done on all processors as an HPF\_LOCAL program
      would do it.

\item If {\em result} is private and all {\em rhs} arrays are explicitly 
      mapped, the 
      work is distributed as in HPF and the values of the results are 
      broadcast to the {\em result} on each processor.

\item If {\em result} is explicitly mapped and NOT all {\em rhs} arrays are 
      explicitly mapped, the results of the operation are undefined.
     
\item If {\em result} is private and some, but not all, {\em rhs} arrays are 
      explicitly mapped, the value is computed on each processor
      and saved to the local {\em result}.

\end{itemize}

For consistency, all processors must participate in any array syntax 
statement in which the value of an explicitly mapped array is modified.
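
As a sketch, the five cases can be written out with two explicitly mapped
arrays and two private arrays.  The array names are invented for this
example, and the comments simply restate the rules given above.

\begin{verbatim}
      REAL A(100), B(100), P(100), Q(100)
!     A and B are explicitly mapped; P and Q are private.
!HPF$ DISTRIBUTE (BLOCK) :: A, B
!     Mapped result, all rhs mapped: distributed as in HPF.
      A = A + B
!     Private result, all rhs private: every processor
!     computes with its own copies, as in HPF_LOCAL.
      P = P + Q
!     Private result, all rhs mapped: distributed as in HPF,
!     then broadcast to P on each processor.
      P = A + B
!     Private result, mixed rhs: computed on each processor
!     and saved to the local P.
      P = A + Q
!     Mapped result, rhs not all mapped: result is undefined.
      A = B + Q
\end{verbatim}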


\subsection{{\tt FORALL} Statement and Construct}

The {\tt FORALL} statement is treated exactly as in HPF when data is
explicitly mapped.  When data is private, the {\tt FORALL} is executed
separately on each processor.  Finally, when data in a {\tt FORALL} is
mixed, the rules for array syntax apply.   If any explicitly mapped
data item is modified in a {\em forall-stmt} then arrays in the 
{\em forall-header} must be explicitly mapped.  In a {\tt FORALL}
construct, if any explicitly mapped array is modified, all modified
arrays must be explicitly mapped.


\subsection{{\tt WHERE} Statement}
The rules for the {\tt WHERE} statement are similar to those
for the {\tt FORALL}.  
The {\tt WHERE} statement is treated exactly as in HPF when data is
explicitly mapped.  When data is private, the {\tt WHERE} is executed
separately on each processor.  Finally, when data in a {\tt WHERE} is
mixed, the rules for array syntax apply.   
If any explicitly mapped
data item is modified in a {\em where-stmt}  then arrays in the 
{\em where-header} must be explicitly mapped.  In a {\tt WHERE} statement,
if any explicitly mapped array is modified, all modified
arrays must be explicitly mapped.


\section{Sequence and Storage Association}

Storage and sequence association rules are identical to the HPF Kernel for
explicitly mapped data.  Data that is private follows the rules
for ordinary Fortran 90 sequence and storage association.  This is
consistent with HPF\_LOCAL.

\section{Input and Output}

Private I/O in HPF has sequential semantics, whereas private I/O in HPF\_SPMD
has parallel semantics; in other words, a private read in HPF\_SPMD
requires each processor to read each element of data in a given file, while a
private read in HPF requires a single read by one processor and a broadcast of
that value (where necessary) to all other processors. If the same file is
specified, both languages generate the same results (with great I/O
overhead in the HPF\_SPMD case). HPF\_SPMD allows each processor to open and
read from different files, a feature unavailable to HPF. Private writes
cause many more differences between the two languages, however.
In HPF\_SPMD the user must use some form of synchronization to ensure
that only one processor writes to a file.

\section{Serial Regions}

It is often useful to enter
a region where only one task is executing.  This is particularly 
useful for certain types of I/O.   To facilitate this, two directives
are provided:

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\tt SERIAL }\\
\> {\tt END SERIAL }\\
\end{tabbing}

In addition, one may optionally attach a {\tt COPY} clause to the 
{\tt END SERIAL} directive which specifies the private 
data items whose
values should be broadcast to all processors.  The syntax of this 
directive is:

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\tt !HPF\$     } \> {\tt SERIAL }\\
\>\>{\em sequential region} \\
\>\> ... \\
\> {\tt !HPF\$     } \> {\tt END SERIAL [, COPY (} {\em var$_1$}{\tt [} {\em, var$_2$, ... , var$_n$} {\tt ])]} \\
\end{tabbing}

where {\em var} is private data to be copied to the same named
private data on other processors.

Serial regions provide implicit barrier synchronization points
at both the {\tt SERIAL} and {\tt END SERIAL} directives.  

Serial regions can be nested, but inner directives are ignored.  There
must be a matching, properly nested {\tt END SERIAL} directive for each
{\tt SERIAL} directive.

If a subroutine call occurs within a serial region, the subroutine
executes serially; there is no way to get back to parallel execution
within the subroutine.  All explicitly mapped data is accessible from 
within subroutines called in a serial region, but a subroutine called
from within a serial region cannot declare explicitly mapped data
or remap data.

All processors must participate in the invocation of the serial region.
No branches are allowed into or out of a SERIAL region.
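
A serial region with a {\tt COPY} clause might be used as sketched below to
read a value once and make it available everywhere.  The file name, unit
number, and variable name are hypothetical.

\begin{verbatim}
      INTEGER N
!     Implicit barrier; only one task executes the region.
!HPF$ SERIAL
      OPEN (UNIT=10, FILE='input.dat', STATUS='OLD')
      READ (10,*) N
      CLOSE (10)
!     Implicit barrier; the value of the private N is
!     broadcast to the private N on every processor.
!HPF$ END SERIAL, COPY(N)
\end{verbatim}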

\section{{\tt STOP} and {\tt ABORT}}

Because of the SPMD nature of the HPF\_SPMD routines, the behavior
of these statements must be defined within the context of this extrinsic.
The {\tt STOP} statement stops execution of only the task executing the
statement.  The {\tt ABORT} statement stops execution of all tasks.
If the {\tt STOP} statement is called from a serial region, all tasks
are stopped and the execution is complete.  With respect to the all/one
execution, {\tt EXIT} behaves like {\tt STOP}.


\section{Library and Intrinsic Routines}

\subsection{HPF Local Routine Library}

The HPF\_LOCAL extrinsic environment contains a number of libraries
that are useful for local SPMD programming and a number of libraries
that allow the user to determine global (rather than local) state
information.  These library procedures take as input the name of
a dummy argument and return information on the corresponding global
HPF actual argument.  They may only be invoked by an HPF\_SPMD
procedure that was directly invoked by global HPF code.  They may
be called only for private data.   The libraries reside in a module
called HPF\_LOCAL\_LIBRARY; an HPF\_SPMD routine that calls them should
include the statement
\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\tt USE HPF\_LOCAL\_LIBRARY }\\
\end{tabbing}
or some functionally appropriate variant thereof.


\subsection{HPF Library}

The HPF Library is available to HPF\_SPMD when called with data that is
explicitly mapped and all processors are participating in the call.
In addition, as in HPF\_LOCAL, the entire HPF Library is available for
use with private data.  Mixing private and explicitly mapped data in
calls to the HPF library produces undefined behavior.

\subsection{Parallel Inquiry Intrinsics}

These intrinsics are provided as an extension to HPF.  They provide
information potentially useful to the programmer about the state of
execution in a program.

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\tt IN\_PARALLEL()} \\
\> {\tt IN\_INDEPENDENT()} \\
\end{tabbing}

\section{Task Identity}

{\tt PROCID} from HPF\_LOCAL is provided.  The physical processors are
identified by an integer in the range of 0 to {\em n-1} where {\em n}
is the value returned by the global HPF\_LIBRARY function 
{\tt NUMBER\_OF\_PROCESSORS}.  Processor identifiers are returned
by {\tt ABSTRACT\_TO\_PHYSICAL}, which establishes the one-to-one
correspondence between the abstract processors of an HPF processors
arrangement and the physical processors.  Also, the local library
function {\tt MY\_PROCESSOR} returns the identifier of the calling
processor.
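
As a sketch of code that depends on task identity, the fragment below has
each processor read from its own file, which the private I/O rules of
HPF\_SPMD permit.  The file naming scheme, the unit number, and the variable
names are assumptions made for illustration; the fragment is meant to appear
inside an HPF\_SPMD subprogram.

\begin{verbatim}
      USE HPF_LOCAL_LIBRARY
      INTEGER ME
      CHARACTER(LEN=9) FNAME
      REAL BUF(100)
!     Identifier of the calling processor, in the range 0 to n-1.
      ME = MY_PROCESSOR()
!     Build a per-processor name such as part.0003.
      WRITE (FNAME, '(A,I4.4)') 'part.', ME
      OPEN (UNIT=10, FILE=FNAME, STATUS='OLD')
!     BUF is private, so every processor fills its own copy.
      READ (10,*) BUF
      CLOSE (10)
\end{verbatim}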


\section{Synchronization Primitives}

It is suggested that a number of synchronization primitives be provided
since this model can be programmed at a much lower level than HPF 2.0.
These primitives include:

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> Barriers (test, set, wait)\\
\> Locks (test, set, clear)\\
\> Critical Sections \\
\> Events (test, set, wait, clear)\\
\end{tabbing}

These primitives provide full SPMD programming model support to the 
HPF\_SPMD extrinsic environment.


\subsection{Barriers}

The following intrinsics are available to create and use a user defined
program barrier:
\begin{itemize}
\item {\tt SET\_BARRIER()}
\item {\tt WAIT\_BARRIER()}
\item {\tt TEST\_BARRIER()}
\end{itemize}

The {\tt SET\_BARRIER()} intrinsic indicates that the calling task has 
arrived at the barrier.  The {\tt WAIT\_BARRIER()} intrinsic suspends 
execution of the calling task until all of the other tasks have arrived 
at the barrier and called {\tt SET\_BARRIER()}.  The {\tt TEST\_BARRIER()}
intrinsic tests the state of the barrier, returning FALSE if the barrier 
is set and TRUE if all of the tasks have arrived.

In the following example, a barrier is used to make sure that {\em block3}
is not entered by any task until all tasks have completed execution of
{\em block1}.

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\em block1} \\
\> {\tt CALL SET\_BARRIER()} \\
\> {\em block2} \\
\> {\tt CALL WAIT\_BARRIER()} \\
\> {\em block3} \\
\end{tabbing}

\subsection{Locks}

Locks are used to prevent the simultaneous access of data by multiple
tasks.  The following three intrinsics are used:

\begin{itemize}
\item {\tt SET\_LOCK(}{\em lock}{\tt)} 
\item {\tt CLEAR\_LOCK(}{\em lock}{\tt)} 
\item {\tt TEST\_LOCK(}{\em lock}{\tt)}
\end{itemize}

The {\tt SET\_LOCK(}{\em lock}{\tt)} intrinsic sets the shared
value {\em lock} atomically.  If the lock is already set, the task 
that called {\tt SET\_LOCK} is suspended until the lock is cleared by 
another task and then sets it.
The {\tt CLEAR\_LOCK(}{\em lock}{\tt)} intrinsic clears {\em lock}.
After the call {\em lock} is cleared regardless of its state before the call.
The {\tt TEST\_LOCK(}{\em lock}{\tt)} intrinsic tests the value of {\em lock},
returning TRUE if the lock is set when {\tt TEST\_LOCK} is called
and FALSE if the lock is not set when called.  In the latter case
{\tt TEST\_LOCK} sets the lock before returning.
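
The fragment below sketches both a blocking and a non-blocking use of a
lock.  It assumes that {\tt SET\_LOCK} and {\tt CLEAR\_LOCK} are invoked with
{\tt CALL}, as the barrier routines are, that {\tt TEST\_LOCK} is a logical
function, and that {\tt LOCK} and {\tt COUNTER} are shared variables declared
elsewhere with {\tt LOCK} initially clear.  This proposal does not specify
how such variables are declared, and all names are hypothetical.

\begin{verbatim}
!     Blocking form: wait until the lock is acquired.
      CALL SET_LOCK(LOCK)
      COUNTER = COUNTER + 1
      CALL CLEAR_LOCK(LOCK)

!     Non-blocking form: TEST_LOCK returns FALSE only when the
!     lock was clear, and in that case it has set the lock.
      IF (.NOT. TEST_LOCK(LOCK)) THEN
         COUNTER = COUNTER + 1
         CALL CLEAR_LOCK(LOCK)
      END IF
\end{verbatim}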

\subsection{Critical Sections}

A {\em critical section} prohibits access to a section of code rather
than to a data object.  It is almost identical to a lock, but is 
implemented with a directive.

\begin{itemize}
\item {\tt !HPF\$ CRITICAL} 
\item {\tt !HPF\$ END\_CRITICAL}
\end{itemize}

The {\tt CRITICAL} directive marks the beginning of a code region 
that only one task can enter at a time.  The {\tt END\_CRITICAL} directive 
marks the end of the critical section.

Every {\tt CRITICAL} directive must have a matching {\tt END\_CRITICAL}
directive in the same program unit.  They can be nested as long as there
are the same number of each directive; the inner directives have no 
effect.  Branching into or out of a critical section is not permitted.

The following example shows how a critical section might be used.

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\tt ! }{\em Compute} {\tt LOCAL\_SUM} {\em on every task.} \\
\> {\tt !HPF\$} \> {\tt CRITICAL} \\
\>\> {\tt GLOBAL\_SUM = GLOBAL\_SUM + LOCAL\_SUM} \\
\> {\tt !HPF\$} \> {\tt END\_CRITICAL} \\
\end{tabbing}

\subsection{Events}

Events are typically used to record the state of a program's execution
and to communicate that state to another task.  Because they do not set
locks, as do the lock routines described earlier, they cannot easily be
used to enforce serial access of data.  They are suited to work such as
signalling other tasks when a certain value has been located in a search
procedure.  There are four routines needed to perform the event functions.

\begin{itemize}
\item {\tt SET\_EVENT([}{\em event}{\tt ])} 
\item {\tt CLEAR\_EVENT([}{\em event}{\tt ])}
\item {\tt WAIT\_EVENT([}{\em event}{\tt ])}
\item {\tt TEST\_EVENT([}{\em event}{\tt ])}
\end{itemize}

{\em Event} is a shared integer variable.  If this argument is
present then the event routines use the named variable.  If it is not
then it defaults to a compiler generated single event.  Event setting
and clearing are NOT atomic operations, so code should be written carefully
with that in mind.

The {\tt SET\_EVENT} routine
sets or {\em posts} an event; it declares that an action has been
accomplished or a certain point in the program has been reached.  A
task can post an event at any time, whether the state of the event 
is cleared or already posted.
The {\tt CLEAR\_EVENT} routine clears an event.  The {\tt WAIT\_EVENT} 
routine suspends task execution until the specified event occurs.
The {\tt TEST\_EVENT} routine returns the state, either set (TRUE) or 
clear (FALSE) of an event.
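
The search scenario mentioned above might be sketched as follows.  The names
{\tt FOUND}, {\tt KEYS}, {\tt WANTED}, {\tt NLOCAL}, and {\tt HIT} are
hypothetical; {\tt FOUND} is assumed to be a shared integer event variable
that is clear on entry, {\tt TEST\_EVENT} is assumed to be a logical
function, and the other variables are private.

\begin{verbatim}
      HIT = 0
      DO I = 1, NLOCAL
!        Stop searching once any task has posted the event.
         IF (TEST_EVENT(FOUND)) EXIT
         IF (KEYS(I) .EQ. WANTED) THEN
            HIT = I
            CALL SET_EVENT(FOUND)
            EXIT
         END IF
      END DO
\end{verbatim}

Because posting an event is not atomic and does not lock anything, more than
one task may record a hit before the others observe the event.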

\section{Other Useful Features}

We have found a number of other directives to be extremely
useful.  While these are not required by the model, we should 
consider them for inclusion.

\subsection{Barrier Removal}

It is occasionally useful for an advanced programmer to indicate 
to the compilation system where barriers may not be needed (even though the 
compiler might think that they are necessary,
based upon incomplete knowledge).

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\tt NO BARRIER} \\
\end{tabbing}

\subsection{Parallelism Specification Directives}

These directives allow a user to assert that a routine will only be
called from within a parallel region, a serial region, or from within
both regions.  Without these directives an implementation might be
required to generate two versions of code for each subroutine, depending
upon implementation strategies.  The directives simply make the 
generated code size smaller and remove a test.

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\tt PARALLEL\_ONLY} \\
\> {\tt SERIAL\_ONLY} \\
\> {\tt PARALLEL\_AND\_SERIAL} \\
\end{tabbing}

\subsection{{\tt SYMMETRIC}}

{\tt SYMMETRIC} data is private data that is guaranteed to be at the
same storage location on every processor.  The feature is obviously
tied to certain implementations, but does make PUT and GET functionality
much easier to deal with.

\subsection{{\tt ON} clause}

In addition to the version of {\tt INDEPENDENT} available from HPF 2.0,
a new version of {\tt INDEPENDENT} is proposed that incorporates
the {\tt ON} clause and has a 
number of differences to more easily facilitate the use of the {\tt ON}
clause.  If a restricted
version of the current {\tt ON} proposal is adopted for the HPF Kernel,
that proposal should be adopted instead of this one.

{\tt INDEPENDENT} without the {\tt ON} clause is identical 
to the current HPF implementation of {\tt INDEPENDENT}.

The new version of the {\tt INDEPENDENT} directive in HPF\_SPMD may be 
applied to the first of a group of tightly nested loops and may apply 
to more than one of them. 
This more easily facilitates the use of the {\tt ON} clause. 
The current {\tt INDEPENDENT} directive applies only to a single loop nest.  

The {\tt INDEPENDENT} directive is extended so that multiple loop nests can
be named.

The general syntax for these independent loops is as follows:
\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\>{\tt !HPF\$} \> {\tt INDEPENDENT} ($I_1,I_2,\ldots,I_n$) {\tt ON} {\em array-name}($h_1(I_1),h_2(I_2),\ldots,h_n(I_n))$ \\
\>       \> {\tt DO} $I_1 = L_1, U_1, S_1$       \\
\>       \>\> {\tt DO} $I_2 = L_2, U_2, S_2$     \\
\>       \>\>\> {\tt DO} $I_n = L_n, U_n, S_n$ \\
\>       \>\>\>\> $\ldots$                              \\
\>       \>\>\> {\tt END DO}                            \\
\>       \>\> {\tt END DO}                              \\
\>       \> {\tt END DO}
\end{tabbing}


The syntax and semantics of {\tt INDEPENDENT} with the {\tt ON} 
clause are different from its syntax and semantics without the {\tt ON} 
clause. With the
{\tt ON} clause the directive states that there are no cross-processor
dependencies, but there may be dependencies between iterations on a
processor. It also indicates which loop iterations it refers to. 

If the {\tt ON} clause is used, {\tt INDEPENDENT} must be used 
in the multi-line form.

The rules for the array specified by the {\tt ON}
clause are as follows.  
The iteration space of an {\tt INDEPENDENT} nest must be rectangular.  
That is, the lower loop bound, the upper loop bound, and the step 
expression for each loop indicated by the {\tt INDEPENDENT} induction 
list must be invariant with regard to the {\tt INDEPENDENT} nest.  
Triangular and trapezoidal nests, such as the following, are not allowed:

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\>{\tt !HPF\$}\>{\tt INDEPENDENT (I, J)  ON A(I,J)  ! Erroneous } \\
\>\>      {\tt DO I = 1, N                ! code} \\
\>\>\>      {\tt DO J = 1, I} \\
\>\>\>      {\tt ...} \\
\end{tabbing}

Each index expression of {\em array-name} in the
{\tt ON} clause (the functions $h_i$ above) 
must be of the form 

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\>{\bf [ }{\em a }{\tt * }{\tt loop\_control\_variable }{\tt + }{\bf ] }{\em b}.
\end{tabbing}
where {\em a} and {\em b} must be integer values; they can be 
expressions, constants,
or variables. The value of {\em a} cannot be equal to 0.  The 
values of {\em a} and {\em b} must be
invariant with regard to the {\tt INDEPENDENT} loop nest.  

For example, specifying {\tt A(I,J,K)} is valid.  Specifying {\tt A(3,I+J,K)} 
is not valid because the second subscript involves two loop control
variables.  Specifying {\tt A(I,I,K)} is not valid because {\tt I} appears twice.

Division is prohibited in any index expression of the {\tt ON} clause.  
For example, specifying {\tt A(I/2,J,K)} is not valid.
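
As an illustration of these rules, the following sketch shows a
rectangular nest whose {\tt ON} clause satisfies each constraint.  The
array name, its shape, and the loop body are assumptions chosen for
illustration only.  On a compiler without HPF\_SPMD support the
directive is an ordinary comment and the program runs serially.

\begin{verbatim}
      program on_clause_sketch
      implicit none
      integer, parameter :: n = 8
      real :: a(n, 2*n+1)
      integer :: i, j
      a = 0.0
!     Rectangular nest: bounds and steps are loop-invariant.
!     Subscripts have the form a*lcv + b: I (a=1, b=0) and
!     2*J+1 (a=2, b=1); each control variable appears once.
!     (The array A and the loop body are illustrative only.)
!HPF$ INDEPENDENT (I, J) ON A(I, 2*J+1)
      do i = 1, n
        do j = 1, n
          a(i, 2*j+1) = real(i + j)
        end do
      end do
      print *, sum(a)
      end program on_clause_sketch
\end{verbatim}

Replacing the inner bound {\tt n} by {\tt i} would make the nest
triangular, and replacing {\tt 2*J+1} by {\tt J/2} would introduce a
division; either change would violate the rules above.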


\subsection{{\tt RESIDENT}}

The {\tt RESIDENT} directive can be applied to loops and at the
subroutine level.  It is an assertion that the accesses to a particular
variable in the subroutine (or loop) are only accesses to data that is
local to the processor making the assertion.  For example:

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> \> {\tt REAL A(100), B(100)} \\
\> {\tt !HPF\$     } \> {\tt DISTRIBUTE A(BLOCK), B(BLOCK) }\\
\> {\tt !HPF\$     } \> {\tt RESIDENT A, B }\\
\end{tabbing}

indicates that only local elements of arrays {\tt A} and {\tt B} will
be accessed within the subroutine.  
Note that this is an assertion about the behavior of a program and
not a directive to make it so.

\subsection{{\tt GEOMETRY}}

The {\tt GEOMETRY} directive is like a mapping typedef, allowing the
user to conveniently change the mappings of many arrays at the same
time.  It is similar in many ways to the {\tt TEMPLATE} directive, but
since it is bound to no particular extent it is easier to apply in a
general way.  Users of CRAFT tend to rely heavily on this feature to quickly
distribute a set of arrays similarly.

The syntax of the {\tt GEOMETRY} directive is:

\begin{tabbing}
---------\=---------\=---\=---\=---\=------------\=\kill
\> {\tt !HPF\$     } \> {\tt GEOMETRY} {\em geom}{\tt(}{\em$\delta_1$} {\tt[}{\em , $\delta_2$, ..., $\delta_n$}{\tt])} \\
\> {\tt !HPF\$     } \> {\tt DISTRIBUTE} {\em geom} [::] {\em var$_1${\tt[}, var$_2$, ... , var$_m$}{\tt]} \\
\end{tabbing}

Here $\delta_i$ denotes one of the allowable distribution formats.
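
For instance, under the syntax above, a single geometry might be
declared once and then applied to several arrays of different extents.
The geometry name {\tt G} and the array names in this sketch are
hypothetical; the directives are comments to a compiler that does not
recognize them, so the program also runs serially.

\begin{verbatim}
      program geometry_sketch
      implicit none
      real :: u(100, 100), v(200, 50), w(64, 64)
!     One geometry (hypothetical name G): block-distribute the
!     first dimension and keep the second on-processor.
!HPF$ GEOMETRY G(BLOCK, *)
!     Map all three arrays the same way, whatever their extents.
!HPF$ DISTRIBUTE G :: u, v, w
      u = 1.0
      v = 2.0
      w = 3.0
      print *, sum(u) + sum(v) + sum(w)
      end program geometry_sketch
\end{verbatim}

Changing the mapping of all three arrays then requires editing only
the {\tt GEOMETRY} line.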

\section{New Features of HPF\_SPMD}

HPF\_SPMD starts with the HPF\_LOCAL extrinsic environment then adds all
of the HPF Kernel.   This section lists the new features of HPF\_SPMD.  

\begin{itemize}
\item Suggested changes to {\tt INDEPENDENT} to better support {\tt ON}.
\item New rules defining the interaction of explicitly mapped and private
      data.
\item Parallel inquiry intrinsics:
  \begin{itemize}
  \item {\tt IN\_PARALLEL()}
  \item {\tt IN\_INDEPENDENT()}
  \end{itemize}
\item Serial regions ({\tt SERIAL / END SERIAL})
\item Explicit synchronization primitives:
  \begin{itemize}
  \item Locks (set, wait, clear)
  \item Critical Sections 
  \item Events (test, set, wait, clear)
  \item Barriers (test, set, wait)
  \end{itemize}
\item {\tt PE\_PRIVATE} directive to specify default data mapping behavior
\item Other suggested features:
  \begin{itemize}
  \item {\tt PARALLEL\_ONLY} 
  \item {\tt SERIAL\_ONLY} 
  \item {\tt PARALLEL\_AND\_SERIAL} 
  \item {\tt RESIDENT} 
  \item {\tt SYMMETRIC} 
  \item {\tt GEOMETRY}
  \end{itemize}
\end{itemize}


\end{document}




%-----------------------------------------------------------------------------
% HPF-to-MPI proposal, apparent date Jan 22, 1996
% I don't think this was passed, but am including it in the file
% anyway, there aren't many other choices where it would go...
%-----------------------------------------------------------------------------

Chuck:

Here it is ... I am not sure where this stands, i.e. whether
it is a proposal or just an informative briefing.  But that
is for you to decide (sorry for the mistakes, I am typing
this over the Atlantic ...)

Ian.

\documentstyle[11pt,epsf]{article}

\textheight 23cm
\textwidth  16cm
\voffset = -0.8in
\hoffset = -0.5in

\title{An HPF Binding for MPI}

\author{Ian Foster}

\newcommand{\myepsfsize}[1]{\def\epsfsize##1##2{#1##1}}
\newcommand{\epsfigure}[3]{
        \begin{figure}
        \vspace{5mm}
        \begin{center}
        \leavevmode\epsfbox{#2.eps}
        \end{center}
        \vspace{-3ex}
        \vspace{1mm}
        \caption{\strut{#1}\label{#3}}
        \vspace{1mm}
        \end{figure}
}

\begin{document}
\maketitle

%-------------------------------------------------------------------------
\section{Motivation}

HPF extends F90 with syntax for specifying data-parallel execution
within a single logical thread of control, or task; the Message
Passing Interface (MPI) defines functions for specifying communication
between multiple tasks.  An HPF binding for MPI (``HPF/MPI'') would
provide a standard set of functions for coupling multiple HPF tasks to
form task-parallel computations.  This document uses two example
programs to show what such a binding would look like and how it could
be used.  It also discusses some relevant interface and implementation
issues.  The latter discussion builds on experience with a prototype
HPF/MPI implementation developed with Rakesh Krishnaiyer and Alok
Choudhary of Syracuse and David Kohr at Argonne, using the PGI HPF
compiler as a base.



%-------------------------------------------------------------------------
\section{Using an HPF Binding for MPI}

An HPF binding for MPI works as follows.  A programmer initiating a
computation requests (using some implementation-dependent mechanism)
that a certain number of tasks be created; each task executes a
specified HPF program on a specified number of processors.  Tasks can
call MPI functions to exchange data with other tasks, using either
point-to-point or collective communication operations.  When reading
programs, one can ignore the HPF directives and understand the code as
if it implemented a set of sequential tasks that communicate using MPI
functions.

We illustrate this basic model by presenting HPF/MPI implementations
of the 2-D FFT and 2-block Poisson solver task-parallel benchmarks
proposed by Jaspal Subhlok and Scott Baden, respectively.  The code in
Figure~\ref{fig-2dfft} implements the pipelined 2-D FFT.  As
illustrated in Figure~\ref{fig-2dfftf}, this program is designed to be
executed by two tasks.  The first task performs 1-D FFTs on each
column of a 2-D array; the modified array is then communicated to the
second task which performs 1-D FFTs on each row.  As both tasks
execute the same program, a runtime test is used to determine which
code is executed in each task.  An alternative structure would provide
two distinct programs, one containing the code executed by task 0 and
the other the code executed by task 1.  This structure would use half
as much storage, as arrays {\tt a} and {\tt b} would not be replicated
in the two tasks.

\begin{figure}
\begin{verbatim}
        program twodfft
        include 'mpihpf.h'
        integer nprocs, myid, ierr, status(MPI_STATUS_SIZE)
        integer i, j, k, ipow, N, LN, NITER
        parameter (N=256, NITER=100)
        complex a(N,N), b(N,N)
!HPF$ processors pr(Number_Of_Processors())
!HPF$ distribute a(*,BLOCK), b(BLOCK,*)

!       Initialize MPI and check that we have 2 tasks
        call MPI_Init(ierr)
        call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
        if (nprocs .ne. 2) then
          print *,"2DFFT requires 2 tasks"
          stop 99
        endif

!       Determine which task I am: task 0 or task 1
        call MPI_Comm_rank(MPI_COMM_WORLD, myid, ierr)

        do k = 1, NITER
          if (myid .eq. 0) then   ! I am task 0: send to task 1
            forall(i=1:N, j=1:N) a(i,j) = (1.0,0.0)
            call colfft(N,a)
            call MPI_Send(a,N*N,MPI_COMPLEX,1,99,MPI_COMM_WORLD,ierr)
          else                    ! I am task 1: receive from task 0
            call MPI_Recv(b,N*N,MPI_COMPLEX,0,99,MPI_COMM_WORLD,
     $                    status,ierr)
            call rowfft(N,b)
          endif
        end do

!       Shut down MPI
        call MPI_Finalize(ierr)
        end
\end{verbatim}
\caption{HPF/MPI Implementation of 2-D FFT\label{fig-2dfft}}
\end{figure}

\myepsfsize{0.8}
\epsfigure{Operation of the HPF/MPI 2-D FFT code, in the case where four
processors are allocated to each of the two tasks.  The first task,
which performs 1-D FFTs on each column of the array {\tt a}
(distributed by column), performs {\tt MPI\_Send} operations on this
array.  The second task performs {\tt MPI\_Recv} operations into the
array {\tt b}, which is distributed by row.}{hpf_mpi}{fig-2dfftf}
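
The data motion implied by this figure can be sketched in a few lines.
The following program is only an illustration.  It assumes that both
tasks run on the same number of processors, that processors are
numbered from 1, and that {\tt BLOCK} uses blocks of size
$\lceil N/P \rceil$ as in HPF; the helper {\tt block\_bounds} is a
hypothetical name, not part of HPF or MPI.  For each sending processor
{\tt p} and receiving processor {\tt q} it prints the bounds of the
sub-block that would travel from {\tt p} to {\tt q}.

\begin{verbatim}
      program block_schedule
      implicit none
      integer, parameter :: n = 8, np = 4
      integer :: p, q, clo, chi, rlo, rhi
!     Sender p holds columns clo:chi of a(*,BLOCK); receiver q
!     holds rows rlo:rhi of b(BLOCK,*).  The message from p to
!     q is therefore the sub-block (rlo:rhi, clo:chi).
      do p = 1, np
        call block_bounds(n, np, p, clo, chi)
        do q = 1, np
          call block_bounds(n, np, q, rlo, rhi)
          if (clo <= chi .and. rlo <= rhi) then
            print *, p, '->', q, ':', rlo, rhi, clo, chi
          end if
        end do
      end do
      contains
      subroutine block_bounds(nelem, nproc, me, lo, hi)
!       Hypothetical helper.  Assumes HPF BLOCK with blocks
!       of size ceiling(nelem/nproc) and 1-based numbering.
        integer, intent(in) :: nelem, nproc, me
        integer, intent(out) :: lo, hi
        integer :: bs
        bs = (nelem + nproc - 1) / nproc
        lo = (me - 1)*bs + 1
        hi = min(me*bs, nelem)
      end subroutine block_bounds
      end program block_schedule
\end{verbatim}

Every sender communicates with every receiver in this case, because a
column block of {\tt a} intersects every row block of {\tt b}.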

The code in Figure~\ref{fig-2dp} implements the two-block Poisson
solver, which also executes as two tasks.  The tasks exchange boundary
data (using {\tt MPI\_Sendrecv}), run a solver on their block, and
compute a global error (using {\tt MPI\_Reduce}).

\begin{figure}
\small
\begin{verbatim}
        program block2
        include 'mpihpf.h'
!HPF$ processors pr(NUMBER_OF_PROCESSORS())
        integer, parameter :: n1=256, n2=128, MaxIter=100
        real, dimension(0:n1+1, 0:n1+1)     :: A, At
        real, dimension(n1:n1+n2+1, 0:n2+1) :: B, Bt
!HPF$ distribute A(*, BLOCK) onto pr
!HPF$ align At(i, j) with A(i, j)
!HPF$ distribute B(*, BLOCK) onto pr
!HPF$ align Bt(i, j) with B(i, j)
        intrinsic maxval
        real maxA, maxB, maxAB
        integer nprocs, myid, ierr, status(MPI_STATUS_SIZE)

        call MPI_Init(ierr)
        call MPI_Comm_rank(MPI_COMM_WORLD, myid, ierr)

        if (myid .eq. 0) then
          A = 0.0
          A(1:n1, 1:n1) = 1.0
        else
          B = 0.0
          B(n1+1:n1+n2, 1:n2) = 1.0
        endif
        n1n2 = min(n1,n2)

        do iter = 1, Maxiter
          if (myid .eq. 0) then   ! Code executed by task 0
            call MPI_Sendrecv(A(n1,1:n1n2),   n1n2, MPI_REAL, 1, 99,
     &                        A(n1+1,1:n1n2), n1n2, MPI_REAL, 1, 99,
     &                        MPI_COMM_WORLD, status, ierr)
            At = A
            call Smooth(A, 1, n1, 1, n1)
            maxA = maxval(abs(A-At))
            call MPI_Reduce(maxA, maxAB, 1, MPI_REAL, MPI_MAX, 0,
     &                      MPI_COMM_WORLD, ierr)
            print *, "Error = ",maxAB
          else                    ! Code executed by task 1
            call MPI_Sendrecv(B(n1+1,1:n1n2), n1n2, MPI_REAL, 0, 99,
     &                        B(n1,1:n1n2),   n1n2, MPI_REAL, 0, 99,
     &                        MPI_COMM_WORLD, status, ierr)
            Bt = B
            call Smooth(B, n1+1, n1+n2, 1, n2)
            maxB = maxval(abs(B-Bt))
            call MPI_Reduce(maxB, maxAB, 1, MPI_REAL, MPI_MAX, 0,
     &                      MPI_COMM_WORLD, ierr)
          endif
        end do

        call MPI_Finalize(ierr)
        end
\end{verbatim}
\normalsize
\caption{HPF/MPI Implementation of Two-block Poisson Solver\label{fig-2dp}}
\end{figure}


%-------------------------------------------------------------------------
\section{Interface Issues}

The following are some of the many issues that must be addressed in
defining an HPF binding for MPI.  Most are also F90 binding issues.
(Thanks to Chuck Koelbel and Bill Saphir for discussions on these
topics.)

\begin{enumerate}
\item
An HPF binding for MPI should presumably be identical to (or perhaps a
slight generalization of) an F90 binding for MPI.  Unfortunately an
F90 binding has not yet been defined.  The existing F77 binding is not
adequate for F90 features such as array subsections and derived data
types.  The lack of sequence association in F90 also complicates
things.

\item
It would be nice to define an F90 binding as a module; however, this is
problematic because F90 interface blocks cannot specify an equivalent
to the ``void *'' used in many MPI function interfaces.

\item
The passing of array sections to the MPI asynchronous communication
routines (Isend, Irecv, etc.) can be problematic because an F90
compiler may copy on call and return.  (In fact, a compiler is allowed
to copy even full arrays, though no compilers do.)  A clear
description of what MPI assumes is required.

\item
Asynchronous operations can also be problematic if an HPF compiler
reorders statements so that a variable modified by an asynchronous
receive is read before it is set.  For example, a problem arises if
{\tt y~=~x} is hoisted above the call to {\tt MPI\_Wait} in the
following:
\begin{verbatim}
     call MPI_Irecv(x, N, MPI_REAL, ..., request, ierr)
     ...
     call MPI_Wait(request, status, ierr)
     y = x
\end{verbatim}
This is a potential problem in the C or F77 bindings too, but an HPF
compiler has more incentive to reorder code.  We may require a
directive of the form ``this variable can change in this region.''

\item
The environment in which an HPF extrinsic executes needs to be defined
more precisely than at present.  If using MPI for low-level
communication, extrinsics no longer execute in {\tt MPI\_COMM\_WORLD}.
We could define an {\tt HPF\_COMM\_WORLD}, but probably a better
approach is to define an enquiry function that returns the appropriate
communicator.  (Such a function could also return information about
the tasks that make up the HPF/MPI computation.)

\item
Note that as MPI-2 defines mechanisms for dynamic process management
and remote data access, these same mechanisms can be incorporated into
an HPF binding for MPI.

\end{enumerate}
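
One way a program can sidestep the copy-in/copy-out hazard noted above
for array sections is to pack the section into a contiguous buffer
that the program itself owns for the lifetime of the transfer.  The
sketch below shows only the packing step in plain Fortran 90; the
routine {\tt start\_transfer} is a hypothetical stand-in for a
nonblocking send and is not part of MPI.

\begin{verbatim}
      program pack_section
      implicit none
      integer, parameter :: n = 4
      real :: a(n, n)
      real :: buf(n)
      integer :: i, j
      do j = 1, n
        do i = 1, n
          a(i, j) = real(10*i + j)
        end do
      end do
!     Row 2 of A is a strided section; copy it into a
!     contiguous buffer owned by the caller, so that no
!     compiler temporary is involved in the call.
      buf = a(2, 1:n)
      call start_transfer(buf, n)
      contains
      subroutine start_transfer(x, m)
!       Hypothetical stand-in for a nonblocking send.
        integer, intent(in) :: m
        real, intent(in) :: x(m)
        print *, x
      end subroutine start_transfer
      end program pack_section
\end{verbatim}

With a real nonblocking operation, {\tt buf} would have to remain
unmodified until the matching wait completes.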



%-------------------------------------------------------------------------
\section{Implementation Issues}

We outline the techniques that might be used to implement an HPF
binding for MPI, and highlight problematic issues.

\begin{enumerate}
\item
An HPF implementation that is to support HPF/MPI must be modified to
create multiple tasks and initialize the data structures (e.g., MPI
communicators) used for intratask and intertask communications.  Note
that if an HPF implementation uses MPI as its underlying communication
layer, then intratask communication will now use a communicator other
than {\tt MPI\_COMM\_WORLD}.

\item
HPF/MPI functions will be implemented by a library invoked via HPF's
extrinsic function interface.  This library must determine how the
data to be communicated are distributed, and then use low-level
communication mechanisms to transfer data.  Data distribution
information can be obtained via enquiry functions such as {\tt
HPF\_Distribution} (prior to calling an extrinsic) or by using HPF's
local routine library (within the extrinsic).

\item
The data transfer associated with an {\tt MPI\_Send/MPI\_Recv} pair
can be performed in a variety of ways, with different performance
tradeoffs.  A simple strategy is to gather the data on the sending
side (e.g., at sending processor 0), transfer it to the receiving side
in a single message (e.g., to receiving processor 0), and then perform
a scatter operation to redistribute it.  Alternatively, the sending
and receiving processors can first exchange data distribution
information and then perform the data transfer directly, without the
gather/scatter.  In the latter case, algorithms developed for array
redistribution can be used to compute an efficient communication
schedule.

\item
A programmer can use MPI's persistent send and receive operations
({\tt MPI\_Send\_init}, etc.) to indicate that the same communication
operation is to be performed many times (e.g., in a pipeline).  An
HPF/MPI implementation of these operations can improve performance by
caching and reusing a communication schedule.  (A difficulty is that
MPI's persistent communication operations do not define a full
channel.  An HPF binding for MPI may motivate an MPI-2 proposal for
full channel support.)

\end{enumerate}
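
As a point of reference for the last item, the sketch below shows the
persistent-request pattern in plain MPI with the standard Fortran 77
binding ({\tt mpif.h}), not the proposed HPF binding.  It assumes
exactly two MPI processes; the buffer size and iteration count are
arbitrary.  An HPF/MPI library could compute its communication schedule
when the request is created and reuse it at each {\tt MPI\_Start}.

\begin{verbatim}
      program persistent_pipeline
      implicit none
      include 'mpif.h'
      integer, parameter :: n = 1024, niter = 100
      real :: a(n)
      integer :: comm, req, myid, k, ierr
      integer :: status(MPI_STATUS_SIZE)
      call MPI_Init(ierr)
      comm = MPI_COMM_WORLD
      call MPI_Comm_rank(comm, myid, ierr)
!     Assumes exactly two processes (ranks 0 and 1).
!     Set up the transfer once; a library could build and
!     cache its communication schedule at this point.
      if (myid .eq. 0) then
        call MPI_Send_init(a, n, MPI_REAL, 1, 99, comm, req, ierr)
      else
        call MPI_Recv_init(a, n, MPI_REAL, 0, 99, comm, req, ierr)
      end if
      do k = 1, niter
        if (myid .eq. 0) a = real(k)
!       Each iteration reuses the same persistent request.
        call MPI_Start(req, ierr)
        call MPI_Wait(req, status, ierr)
      end do
      call MPI_Request_free(req, ierr)
      call MPI_Finalize(ierr)
      end program persistent_pipeline
\end{verbatim}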


\end{document}



%-------------------------------------------------------------------------
% again, I don't know the status of this proposal - calling HPF from 
% SPMD, discussed in November.  But here it is\ldots
%-------------------------------------------------------------------------

\documentstyle[11pt,fullpage]{article}

\pagestyle{plain}

\begin{document}


\newcommand{\PROCS}  {{\tt PROCESSORS} }
\newcommand{\HPFINIT}  {{\tt HPF\_INIT( )} }
\newcommand{\offset}[1]{\begin{center} \tt #1 \end{center}} 

%syntax-macs.tex
% Edited by Scott B. Baden for the purposes of this document

%Version of July 29, 1992 - Guy Steele, Thinking Machines
%constraint environment added Dec 15, 1992 - Chuck Koelbel, Rice
%change ``chapter'' to ``section'', ``appendix'' to ``annex''---GLS 3/15/93
%automatically number BNF rules---GLS 3/15/93
%put a line-numbering ruler in the gutter margins---GLS 3/15/93
%BNF rules generate labels of form ``lhs-rule''---GLS 3/16/93
%new environments rationale, implementors, users---GLS 3/16/93
%additional BNF hackery to support separate annex of glommed BNF---GLS 3/16/93
%new BNFcrossref environment, \Fortranrule---GLS 3/17/93
%\BNFpreface, \crossrefpreface---GLS 3/17/93
%Make sure no line of this file is longer than 80 characters.---GLS 3/23/93
%BNFcrossref* environment omits ``defined'' header.---GLS 3/30/93
%Make \subsubsection work and be numbered.---GLS 3/30/93
%\bnfruleprefix{...} added to allow change of BNF rule prefix to other 
%than "H".  Must appear after including these macros 
%E.g.\def\bnfruleprefix{J}  ---GLS 5/7/93
%Stuff for margin bars.  Surround paragraphs with environments ``obsolete'',
% ``new'', or ``newer''.  ---GLS 8/11/94

\newdimen\bnfalign         \bnfalign=2.1in
\newdimen\bnfopwidth       \bnfopwidth=.27in
\newdimen\bnfindent        \bnfindent=.2in
\newdimen\bnfsep           \bnfsep=6pt
\newdimen\bnfmargin        \bnfmargin=0in
\newdimen\bnfrulenumwidth  \bnfrulenumwidth=.45in
\newdimen\codemargin       \codemargin=1em
\newdimen\intrinsicmargin  \intrinsicmargin=3em
\newdimen\casemargin       \casemargin=0.75in
\newdimen\argumentmargin   \argumentmargin=1.8in

\def\IT{\it}
\def\RM{\rm}
\let\CHAR=\char
\let\CATCODE=\catcode
\let\DEF=\def
\let\GLOBAL=\global
\let\RELAX=\relax
\let\BEGIN=\begin
\let\END=\end


\def\FUNNYCHARACTIVE{\CATCODE`\a=13 \CATCODE`\b=13 \CATCODE`\c=13
                     \CATCODE`\d=13 \CATCODE`\e=13 \CATCODE`\f=13
                     \CATCODE`\g=13 \CATCODE`\h=13 \CATCODE`\i=13
                     \CATCODE`\j=13 \CATCODE`\k=13 \CATCODE`\l=13
                     \CATCODE`\m=13 \CATCODE`\n=13 \CATCODE`\o=13
                     \CATCODE`\p=13 \CATCODE`\q=13 \CATCODE`\r=13
                     \CATCODE`\s=13 \CATCODE`\t=13 \CATCODE`\u=13
                     \CATCODE`\v=13 \CATCODE`\w=13 \CATCODE`\x=13
                     \CATCODE`\y=13 \CATCODE`\z=13 \CATCODE`\[=13
                     \CATCODE`\]=13 \CATCODE`\-=13}

\def\RETURNACTIVE{\CATCODE`\
=13}

\makeatletter

\def\withlinenumbers{\relax
  \def\@oddfoot{\@marginfoot\hbox to 0pt{\hss\LineNumberRuler\hskip 1.5pc}\hfil}\relax
  \def\@evenfoot{\@marginfoot\hfil\hbox to 0pt{\hskip 1.5pc\LineNumberRuler\hss}}}

\def\LineNumberRuler{\vbox to 0pt{\vss\normalsize \baselineskip13.6pt
    \lineskip 1pt \normallineskip 1pt \def\baselinestretch{1}\relax
    \LNR{1}\LNR{2}\LNR{3}\LNR{4}\LNR{5}\LNR{6}\LNR{7}\LNR{8}\LNR{9}
    \LNR{10}\LNR{11}\LNR{12}\LNR{13}\LNR{14}
        \LNR{15}\LNR{16}\LNR{17}\LNR{18}\LNR{19}
    \LNR{20}\LNR{21}\LNR{22}\LNR{23}\LNR{24}
        \LNR{25}\LNR{26}\LNR{27}\LNR{28}\LNR{29}
    \LNR{30}\LNR{31}\LNR{32}\LNR{33}\LNR{34}\LNR{35}
        \LNR{36}\LNR{37}\LNR{38}\LNR{39}
    \LNR{40}\LNR{41}\LNR{42}\LNR{43}\LNR{44}
        \LNR{45}\LNR{46}\LNR{47}\LNR{48}
    \vskip 31pt}}
\def\LNR#1{\hbox to 1pc{\hfil\tiny#1\hfil}}

\def\ps@plainwithlinenumbers{\ps@plain\withlinenumbers}



\def\VECTOR#1{\relax
    \@ifnextchar,{\@MATRIXTABS{}#1,\@FOO, \hskip0pt plus 1filll
                                \penalty-1\@gobble
  }{\@ifnextchar;{\@MATRIXTABS{}#1,\@FOO; \hskip0pt plus 1filll
                                \penalty-1\@gobble
  }{\@ifnextchar:{\@MATRIXTABS{}#1,\@FOO: \hskip0pt plus 1filll
                                \penalty-1\@gobble
  }{\@ifnextchar.{\@MATRIXTABS{}#1,\@FOO. \hskip0pt plus 1filll
                                \penalty-1\@gobble
  }{\@MATRIXTABS{}#1,\@FOO{ }\hskip0pt plus 1filll\penalty-1}}}}}

\def\MATRIX#1{\relax
    \@ifnextchar,{\@MATRIXTABS{}#1,\@FOO, \hskip0pt plus 1filll
                                \penalty-1\@gobble
  }{\@ifnextchar;{\@MATRIXTABS{}#1,\@FOO; \hskip0pt plus 1filll
                                \penalty-1\@gobble
  }{\@ifnextchar:{\@MATRIXTABS{}#1,\@FOO: \hskip0pt plus 1filll
                                \penalty-1\@gobble
  }{\@ifnextchar.{\hfill\penalty1\null\penalty10000\hskip0pt plus 1filll
                  \@MATRIXTABS{}#1,\@FOO.\penalty-50\@gobble
  }{\@MATRIXTABS{}#1,\@FOO{ }\hskip0pt plus 1filll\penalty-1}}}}}

\def\@MATRIXTABS#1#2,{\@ifnextchar\@FOO{\@MATRIX{#1#2}}{\@MATRIXTABS{#1#2&}}}
\def\@MATRIX#1\@FOO{\(\left[{\tt\tabcolsep=.4em
            \begin{tabular}{rrrrrrrrrr}#1\end{tabular}}\right]\)}

\def\@IFSPACEORRETURNNEXT#1#2{\def\@tempa{#1}\def\@tempb{#2}\relax
        \futurelet\@tempc\@ifspnx}

\def\ADDA{\xdef\bnflabel{\bnflabel a}}
\def\ADDB{\xdef\bnflabel{\bnflabel b}}
\def\ADDC{\xdef\bnflabel{\bnflabel c}}
\def\ADDD{\xdef\bnflabel{\bnflabel d}}
\def\ADDE{\xdef\bnflabel{\bnflabel e}}
\def\ADDF{\xdef\bnflabel{\bnflabel f}}
\def\ADDG{\xdef\bnflabel{\bnflabel g}}
\def\ADDH{\xdef\bnflabel{\bnflabel h}}
\def\ADDI{\xdef\bnflabel{\bnflabel i}}
\def\ADDJ{\xdef\bnflabel{\bnflabel j}}
\def\ADDK{\xdef\bnflabel{\bnflabel k}}
\def\ADDL{\xdef\bnflabel{\bnflabel l}}
\def\ADDM{\xdef\bnflabel{\bnflabel m}}
\def\ADDN{\xdef\bnflabel{\bnflabel n}}
\def\ADDO{\xdef\bnflabel{\bnflabel o}}
\def\ADDP{\xdef\bnflabel{\bnflabel p}}
\def\ADDQ{\xdef\bnflabel{\bnflabel q}}
\def\ADDR{\xdef\bnflabel{\bnflabel r}}
\def\ADDS{\xdef\bnflabel{\bnflabel s}}
\def\ADDT{\xdef\bnflabel{\bnflabel t}}
\def\ADDU{\xdef\bnflabel{\bnflabel u}}
\def\ADDV{\xdef\bnflabel{\bnflabel v}}
\def\ADDW{\xdef\bnflabel{\bnflabel w}}
\def\ADDX{\xdef\bnflabel{\bnflabel x}}
\def\ADDY{\xdef\bnflabel{\bnflabel y}}
\def\ADDZ{\xdef\bnflabel{\bnflabel z}}
\def\ADDHYPHEN{\xdef\bnflabel{\bnflabel -}}

{
\FUNNYCHARACTIVE
\GLOBAL\DEF\FUNNYCHARDEF{\RELAX
    \DEFa{{\IT\CHAR"61}\ADDA}\DEFb{{\IT\CHAR"62}\ADDB}\RELAX
    \DEFc{{\IT\CHAR"63}\ADDC}\DEFd{{\IT\CHAR"64}\ADDD}\RELAX
    \DEFe{{\IT\CHAR"65}\ADDE}\DEFf{{\IT\CHAR"66}\ADDF}\RELAX
    \DEFg{{\IT\CHAR"67}\ADDG}\DEFh{{\IT\CHAR"68}\ADDH}\RELAX
    \DEFi{{\IT\CHAR"69}\ADDI}\DEFj{{\IT\CHAR"6A}\ADDJ}\RELAX
    \DEFk{{\IT\CHAR"6B}\ADDK}\DEFl{{\IT\CHAR"6C}\ADDL}\RELAX
    \DEFm{{\IT\CHAR"6D}\ADDM}\DEFn{{\IT\CHAR"6E}\ADDN}\RELAX
    \DEFo{{\IT\CHAR"6F}\ADDO}\DEFp{{\IT\CHAR"70}\ADDP}\RELAX
    \DEFq{{\IT\CHAR"71}\ADDQ}\DEFr{{\IT\CHAR"72}\ADDR}\RELAX
    \DEFs{{\IT\CHAR"73}\ADDS}\DEFt{{\IT\CHAR"74}\ADDT}\RELAX
    \DEFu{{\IT\CHAR"75}\ADDU}\DEFv{{\IT\CHAR"76}\ADDV}\RELAX
    \DEFw{{\IT\CHAR"77}\ADDW}\DEFx{{\IT\CHAR"78}\ADDX}\RELAX
    \DEFy{{\IT\CHAR"79}\ADDY}\DEFz{{\IT\CHAR"7A}\ADDZ}\RELAX
    \DEF[{{\RM\CHAR"5B}}\DEF]{{\RM\CHAR"5D}}\RELAX
    \DEF-{\@IFSPACEORRETURNNEXT{{\CHAR"2D}}{{\IT\CHAR"2D}\ADDHYPHEN}}}
}

%%% Warning!  Devious return-character machinations in the next several lines!
%%%           Don't even *breathe* on these macros!
{\RETURNACTIVE\global\def\RETURNDEF{\def
{\@ifnextchar\FNB{}{\@stopline\@ifnextchar
{\@NEWBNFRULE}{\penalty\@M\@startline\ignorespaces}}}}\global\def\@NEWBNFRULE
{\@NEWBNFGUTS}\global\def\@ifspnx{\ifx\@tempc\@sptoken\@DAxfer\else\ifx\@tempc
\let\@tempd\@tempa \else \let\@tempd\@tempb \fi\fi \@tempd}}
%%% End of bizarro return-character machinations.

\def\@DAxfer{\let\@tempd\@tempa}
\def\@NEWBNFGUTS{\xdef\bnflabel{}\vskip\bnfsep\@startline\ignorespaces}

\let\crossrefpreface=\relax


\def\Fortranrule#1#2{{\def\@currentlabel{#1}\label{#2-rule}}}

\begingroup \catcode `|=0 \catcode`\\=12
|gdef|@XCODE#1\EDOC{#1|endtrivlist|end{tt}}
|endgroup

\def\CODE{\begin{tt}\advance\@totalleftmargin\codemargin \@verbatim
   \def\@underbarchar{{\char"5F}}\frenchspacing \@vobeyspaces \@XCODE}
\def\ICODE{\begin{tt}\advance\@totalleftmargin\codemargin \@verbatim
   \def\@underbarchar{{\char"5F}}\frenchspacing \@vobeyspaces
   \xdef\bnflabel{}\relax
   \FUNNYCHARDEF\FUNNYCHARACTIVE \UNDERBARACTIVE\UNDERBARDEF \@XCODE}

\def\@underbarsub#1{{\ifmmode _{#1}\else {$_{#1}$}\fi}}
\let\@underbarchar\_
\def\@underbar{\let\@tempq\@underbarsub
  \if\@tempz A\let\@tempq\@underbarchar\fi
  \if\@tempz B\let\@tempq\@underbarchar\fi
  \if\@tempz C\let\@tempq\@underbarchar\fi
  \if\@tempz D\let\@tempq\@underbarchar\fi
  \if\@tempz E\let\@tempq\@underbarchar\fi
  \if\@tempz F\let\@tempq\@underbarchar\fi
  \if\@tempz G\let\@tempq\@underbarchar\fi
  \if\@tempz H\let\@tempq\@underbarchar\fi
  \if\@tempz I\let\@tempq\@underbarchar\fi
  \if\@tempz J\let\@tempq\@underbarchar\fi
  \if\@tempz K\let\@tempq\@underbarchar\fi
  \if\@tempz L\let\@tempq\@underbarchar\fi
  \if\@tempz M\let\@tempq\@underbarchar\fi
  \if\@tempz N\let\@tempq\@underbarchar\fi
  \if\@tempz O\let\@tempq\@underbarchar\fi
  \if\@tempz P\let\@tempq\@underbarchar\fi
  \if\@tempz Q\let\@tempq\@underbarchar\fi
  \if\@tempz R\let\@tempq\@underbarchar\fi
  \if\@tempz S\let\@tempq\@underbarchar\fi
  \if\@tempz T\let\@tempq\@underbarchar\fi
  \if\@tempz U\let\@tempq\@underbarchar\fi
  \if\@tempz V\let\@tempq\@underbarchar\fi
  \if\@tempz W\let\@tempq\@underbarchar\fi
  \if\@tempz X\let\@tempq\@underbarchar\fi
  \if\@tempz Y\let\@tempq\@underbarchar\fi
  \if\@tempz Z\let\@tempq\@underbarchar\fi\@tempq}
\def\@under{\futurelet\@tempz\@underbar}

\def\UNDERBARACTIVE{\CATCODE`\_=13}
\UNDERBARACTIVE
\def\UNDERBARDEF{\def_{\protect\@under}}
\UNDERBARDEF

\catcode`\$=11  

%the following line would allow derived-type component references 
%FOO%BAR in running text, but not allow LaTeX comments
%without this line, write FOO\%BAR
%\catcode`\%=11 


%Put this in the "standard header", say, right after the redefinitions
%of \section and \subsection.


\def\alternative#1 #2#3{\def\@tempa{#1}\def\@tempb{A}\ifx\@tempa\@tempb\else
    \expandafter\@altbumpdown\string#2\@foo\fi
    #2{Version #1: #3}}
\def\@altbumpdown#1#2\@foo{\global\expandafter\advance\csname c@#2\endcsname
        -1\relax}

\makeatother



%--------------------------------------
% constraints environment - nicely formats constraints
% usage:
% \begin{constraints}
% \item Thou shalt not use Courier font.
% \item Thou shalt not take HPF's name in vain.
% \end{constraints}
% produces:
% Constraint:  Thou shalt not use Courier font.
% Constraint:  Thou shalt not take HPF's name in vain.

\newenvironment{constraints}{
\begin{list}{Constraint:}{
\settowidth{\labelwidth}{Constraint:}
\settowidth{\labelsep}{w}
\settowidth{\leftmargin}{Constraint:w}
\setlength{\rightmargin}{0cm}
}
}{
\end{list}
}

%Guy Steele's version was:
%\def\constraints{\list{\hbox{Constraint:\quad
%       }}{\setbox0\hbox{Constraint:\quad}\leftmargin\wd0
%  \labelwidth\wd0 \labelsep0pt \let\makelabel\relax}}
%\let\endconstraints\endlist
%----------------------------------------


\newenvironment{rationale}{\begin{list}{}{}\item[]{\it Rationale.}
}{{\rm ({\it End of rationale.})} \end{list}}

\newenvironment{implementors}{\begin{list}{}{}\item[]{\it Advice
        to implementors.}
}{{\rm ({\it End of advice to implementors.})} \end{list}}

\newenvironment{users}{\begin{list}{}{}\item[]{\it Advice to users.}
}{{\rm ({\it End of advice to users.})} \end{list}}



%Guy Steele math macros used in Distribution Chapter of HPF Report:
\def\cmod{\mskip-\medmuskip\mkern5mu
  \mathbin{\rm cmod}\penalty900\mkern5mu\mskip-\medmuskip}
\def\cdiv{\mskip-\medmuskip\mkern5mu
  \mathbin{\rm cdiv}\penalty900\mkern5mu\mskip-\medmuskip}
\makeatletter


%%%%% Stuff for margin bars

\makeatletter

\def\@WhiteBar{\special{"  -3 0    moveto
                           3  0    lineto
                           3  -720 lineto
                           -3 -720 lineto
                           closepath
                           1 setgray
                           fill }}
\def\@BlackBar{\special{"  0 setlinecap
                           0.5 setlinewidth
                           0    0 moveto
                           0 -720 lineto
                           0 setgray
                           stroke }}

\def\@doublemargin{\hbox to 0pt{\@BlackBar\hskip1.5pt\@BlackBar\hss}}

\def\@marginhead{\hbox to 0pt{\hss\vbox to 0pt
  {\vskip\headsep\vskip3.75pt\currmargin\vss}\hskip8pt}\relax
  \global\let\currmargin=\nextmargin}
\def\@marginfoot{\hbox to 0pt{\hss\vbox to 0pt
  {\vskip-\footskip\vskip3.75pt\whitemargin\vss}\hskip8pt}}

\let\@margineverypar\relax
\let\@margineveryparguts\relax

\def\@new#1#2{\par\begingroup\let\@savemargin=\nextmargin
  \global\let\nextmargin=#2\relax
  \def\@margineveryparguts{\setbox1\hbox{\setbox0\hbox{X}\raise\ht0\hbox to 0pt{\hss#1\kern 8pt}}\relax
                \ht1=0pt\dp1=\dp\strutbox\box1}
  \def\@margineverypar{\strut\vadjust{\kern-\dp\strutbox
                \@margineveryparguts}\relax
            \let\@margineverypar\relax\everypar{}}
  \everypar{\@margineverypar}}

\def\new{\@new\startmargin\blackmargin}
\def\endnew{\par \@tempskipb\lastskip
  \@tempdima\prevdepth \ifdim 0pt>\@tempdima \@tempdima=-.001pt\fi
  \@tempdimb\@tempdima \ifdim \@tempdimb<0.75\dp\strutbox \@tempdimb=0.75\dp\strutbox\fi
  \nobreak\vskip-\@tempskipb\kern-\@tempdima\nointerlineskip
  \setbox1\hbox{\lower\@tempdimb\hbox to 0pt{\hss%\@MarginCap
  \@savemargin\hskip 8pt}}\ht1=0pt\dp1=\@tempdima\box1
  \vskip\@tempskipb
  \global\let\nextmargin=\@savemargin\endgroup}

\def\newer{\@new\@doublemargin\@doublemargin}
\let\endnewer=\endnew

\def\obsolete{\@new\@obsodots\@obsodots}
\def\@obsodots{\hbox to 0pt{\hss
        \vtop to 0pt{\leaders\vbox to 3.125pt{\hbox{\scriptsize .}
                     \nointerlineskip\vss}\vskip625pt\vss}\hss}}

\let\endobsolete=\endnew

     \let\blackmargin=\@BlackBar
                   \let\whitemargin=\@WhiteBar
     \let\startmargin=\@BlackBar
     \let\stopmargin=\@WhiteBar
                   \let\currmargin=\@WhiteBar
                   \let\nextmargin=\@WhiteBar

\if@twoside \def\ps@headings{\let\@mkboth\markboth
\def\@oddfoot{\@marginfoot\hfil}\def\@evenfoot{\@marginfoot\hfil}\def\@evenhead{\@marginhead\rm \thepage\hfil \sl
\leftmark}\def\@oddhead{\@marginhead\hbox{}\sl \rightmark \hfil
\rm\thepage}\def\chaptermark##1{\markboth {\uppercase{\ifnum \c@secnumdepth
>\m@ne
 \@chapapp\ \thechapter. \ \fi ##1}}{}}\def\sectionmark##1{\markright
{\uppercase{\ifnum \c@secnumdepth >\z@
 \thesection. \ \fi ##1}}}}
\else \def\ps@headings{\let\@mkboth\markboth
\def\@oddfoot{\@marginfoot\hfil}\def\@evenfoot{\@marginfoot\hfil}\def\@oddhead{\@marginhead\hbox {}\sl \rightmark \hfil
\rm\thepage}\def\chaptermark##1{\markright {\uppercase{\ifnum \c@secnumdepth
>\m@ne
 \@chapapp\ \thechapter. \ \fi ##1}}}}
\fi

\makeatother

\begin{center}
{\Large
{\bf
Proposal for calling HPF from SPMD\\
(Extrinsically coordinated HPF programs)\\[0.2in]
V 1.0, 10/3/95}}
\end{center}




\section {Introduction --- Proposal for calling HPF from SPMD}

In certain applications it is useful to handle some aspects
of coordination and data motion outside HPF.  To this end we
may write a coordination ``harness,'' either in HPF Local or in another
extrinsic language (e.g., C++), from which we invoke HPF routines
to carry out computation.  This document proposes a facility
that enables the extrinsic coordination of HPF programs and
complements
the ``HPF-to-SPMD'' facility described in Annex A of the HPF spec.
In particular, we define an interface consisting of a set of
entries and a derived data type that enables SPMD code
written in HPF Local to communicate with HPF.
Interfaces for other extrinsic languages,
such as C or C++, are outside the scope of this document;
the interested reader is referred to Andrew Meltzer's interoperability
document, which treats the symmetric case of calling C from HPF.

More generally, the notion of external coordination (``external'' in the
sense of being outside HPF) provides us with the capability of coordinating
two or more concurrently executing HPF programs.  When the programs
are identical, we have ``multi data parallelism,'' which is useful
in managing multiblock applications.  When the HPF programs
are different, we have task parallelism.  The notion of processor
subsets, currently under examination by the HPF Forum, is relevant to
both cases.



\section{Terminology}

In specifying an interface to external coordination it is convenient
to borrow terminology from MPI.  However, this convention should
not be taken as a prescription to HPF compiler writers, but rather as
a measure of expedience: the MPI terminology is well documented and
part of an emerging standard.  Thus, compiler writers are free
to employ their own implementation strategy, in particular when doing
so leads to a more efficient implementation than if MPI were used.
To wit, we borrow the following terms from MPI: {\em communication domain,
process group, rank, communicator,} and {\em intercommunicator.}
The following definitions are taken from the text ``MPI: The Complete
Reference,'' by Snir et al., published by MIT Press.



\begin{itemize}

\item
A {\em group} is an ordered set of process identifiers (henceforth
processes): processes are implementation-dependent objects.

\item
Processes in a group are ordered and identified by an integer {\em rank.}

\item
A {\em communication domain} is a global distributed structure that allows
processes in a group to communicate with each other, or to communicate
with processes in another group.

\item
A {\em communicator} is an opaque object with a number of attributes...
[it] specifies a communication domain which can be used in
point-to-point communication.
\end{itemize}


Informally, we identify an HPF \PROCS\  arrangement with  an MPI communicator.
The advantage of this approach is that the values returned by
{\tt NUMBER\_OF\_PROCESSORS} and {\tt PROCESSORS\_SHAPE} are well-defined.
Note that it is possible to dynamically change the members of a process
group or to change the group's topology by modifying
attributes of the
communicator.

\subsection{SPMD Execution }

We define SPMD execution in the following sense: all members
of a process group execute the same program in loosely synchronous
fashion, making calls to HPF and its interface {\em as if} at the same time.
We note that the synchronization constraints on calls to HPF routines
or to the SPMD-to-HPF interface are more stringent than
those on other code, because of the requirement
that HPF code be portable over a wide range of communication models,
including shared memory.


\section{The Interface}

We first consider the case of a  single SPMD program calling HPF.
We will discuss multi data parallelism and task parallelism later on,
since these build on the simplest case of a single SPMD program.

\subsection{Mapped Array Allocation}

The functionality required to support external SPMD control of an HPF
program centers on the ability to allocate, from the SPMD level, mapped
HPF arrays with a specified \PROCS\ arrangement, alignment,
and decomposition.  This is nicely defined in Annex A of the
{\em HPF Spec, v 1.1},
which I adapt for the task at hand.  See the example on pages 166--7
of the HPF Spec for details.

\begin{enumerate}
\item All mapped HPF arrays accessible from an SPMD procedure are logically carved
up into pieces; the SPMD process executing on a particular physical
processor sees an array containing just those elements of the global
array that are mapped to that physical processor.

\item The model assumes that array axes are mapped independently to axes of a
rectangular PROCESSORS grid, each array axis to at most one PROCESSORS
axis (no ``skew'' distributions) and no two array axes to the same
PROCESSORS axis.  This restriction suffices to ensure that each
physical processor contains a subset of array elements that can be
locally arranged in a rectangular configuration.  (Of course, computing
the global indices of an element given its local indices, or vice versa,
may be quite a tangled computation, but it will be possible.)
\end{enumerate}
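
\noindent
As an illustration of the first point, the following Fortran 90 sketch lists
the global index range that each processor would hold along one array axis
under a {\tt BLOCK} distribution, using the HPF convention that the block
size is $\lceil n/p \rceil$.  The extent {\tt n}, the processor count
{\tt p}, and all other names are assumed values chosen for the example; the
program is plain Fortran and is not part of the proposed interface.

\CODE
        program block_bounds
        implicit none
!       assumed axis extent and number of processors along that axis
        integer, parameter :: n = 10, p = 4
        integer :: b, k, lo, hi

!       HPF BLOCK distribution: block size is ceiling(n/p)
        b = (n + p - 1) / p

        do k = 0, p - 1
!       global bounds of the piece held by processor k (0-based)
            lo = k*b + 1
            hi = min((k + 1)*b, n)
            if (lo > hi) then
                print '(a,i3,a)', 'processor', k, ': no elements'
            else
                print '(a,i3,a,i3,a,i3)', 'processor', k,  &
                      ': global', lo, ' ..', hi
            end if
        end do
        end program block_bounds
\EDOC

\noindent
With these values the last processor holds a single element, and for other
choices of {\tt n} and {\tt p} trailing processors may hold none, which is
one reason the local shape has to be queried rather than assumed.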


Since manufacturers need flexibility  in representing a mapped array
(along with any affiliated state) we do not permit direct access
to the mapped array. Rather, we employ an opaque descriptor and an
interface that allows the SPMD program to access or query
the mapped array---e.g.
determine global shape, local shape, decomposition, and so on.

\begin{implementors}

Random access to the elements of a mapped array is likely to be
expensive, but aggregate (block) moves  should be efficient.


Implementations that employ ghost cells or shadow regions may need to provide
the capability of accessing such additional data.  This standard
does not specify how ghost cells are to be accessed or manipulated.

\end{implementors}


The general idea is that a process group will dynamically allocate a mapped
array  by making a ``group'' call to a routine called
{\tt create\_mapped\_array\_descriptor()}.  This routine should be called
by all members of the group with the same arguments, as if at the same time:

\hspace{0.6in}{\tt create\_mapped\_array\_descriptor(MAD [, comm])}\\

\begin{tabular}{ll}
{\tt OUT MAD} &         Mapped array descriptor \\
{\tt IN  comm} &        The communicator
\end{tabular}
\vspace{0.1in}

\noindent
The first
argument is a result argument that is to receive the opaque
mapped array descriptor.  The second (optional) argument
enables the user to specify an alternative communicator as described
in the next section.


\subsection{HPF Run Time Initialization}

A call to a routine \HPFINIT\ is required to establish
various state needed by the HPF run time system.
\HPFINIT\ must   be passed a communicator corresponding
to the process group used to control the HPF program
(with task parallelism, we will have different
calls to \HPFINIT, one for each task).  We note that the
communicator encodes a default \PROCS\ topology to be employed
in subsequent allocations of mapped arrays.  If we wish to
override the default
\PROCS\ topology in any subsequent allocations, then we must pass
an alternative communicator to 
{\tt create\_mapped\_array\_descriptor( )}.
This communicator must include all members present in the default
\PROCS\ topology that was originally specified in the call to 
{\tt HPF\_INIT()}; it is an error to  specify a communicator
that is lacking any member of the process group.
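
The membership rule can be pictured as a containment test on two process
groups.  The sketch below, in plain Fortran 90, represents each group as a
list of process identifiers; the identifiers, the array names, and the
function {\tt covers} are hypothetical and are not part of the proposed
interface, which would obtain this information from the communicators
themselves.

\CODE
        program check_members
        implicit none
!       hypothetical process identifiers of the default group
        integer, parameter :: default_group(4) = (/ 0, 1, 2, 3 /)
!       same members in a different order: acceptable
        integer, parameter :: alt_ok(4)  = (/ 3, 2, 1, 0 /)
!       process 3 is missing: an error under the rule above
        integer, parameter :: alt_bad(3) = (/ 0, 1, 2 /)

        print *, 'alt_ok  covers default group: ',  &
                 covers(alt_ok, default_group)
        print *, 'alt_bad covers default group: ',  &
                 covers(alt_bad, default_group)

        contains

!       true when every member of def also appears in alt
        logical function covers(alt, def)
        integer, intent(in) :: alt(:), def(:)
        integer :: i
        covers = .true.
        do i = 1, size(def)
            if (.not. any(alt == def(i))) covers = .false.
        end do
        end function covers

        end program check_members
\EDOC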

\begin{implementors}

\HPFINIT\ should extract the \PROCS\ topology from the communicator
passed into it.

\end{implementors}


\subsection{Queries and Mapped Array Access}

These routines enable the user to access a mapped array using an interface
defined by HPF.  In effect these are ``mirror images'' of routines defined
in Annex A of the HPF spec.


{\tt GLOBAL\_ALIGNMENT() }\\
{\tt GLOBAL\_DISTRIBUTION() }\\
{\tt GLOBAL\_TEMPLATE() }\\
{\tt GLOBAL\_LBOUND() }\\
{\tt GLOBAL\_UBOUND() }\\
{\tt GLOBAL\_SHAPE() }\\
{\tt GLOBAL\_SIZE() }\\
{\tt ABSTRACT\_TO\_PHYSICAL() }\\
{\tt PHYSICAL\_TO\_ABSTRACT() }\\
{\tt LOCAL\_TO\_GLOBAL() }\\
{\tt GLOBAL\_TO\_LOCAL() }\\
{\tt LOCAL\_BLKCNT() }\\
{\tt LOCAL\_LINDEX() }\\
{\tt LOCAL\_UINDEX() }


Note that we employ opaque mapped array descriptors in place of the
global HPF array arguments that the Annex specifies for these functions.


\begin{implementors}

It may be useful to  supply a {\tt copy( )} routine that copies
one mapped array into another.

\end{implementors}

To see how these routines are employed in practice we consider the
following example, taken from page 169 of Annex A, entitled
``Accessing Dummy Arguments by Blocks.''

\begin{quote}
The mapping of a global HPF array to the physical processors places one
or more {\em blocks}, which are groups of elements with consecutive
indices, on each processor.   The number of blocks mapped to a
processor is the product of the number of blocks of consecutive indices
in each dimension that are mapped to it.  For example, a rank-one array
{\tt X} with a {\tt CYCLIC(4)} distribution will have blocks containing
four elements, except for a possible last block having \(1 + ({\tt
SIZE(X)} - 1) \bmod 4\) elements.   On the other hand, if {\tt X} is first
aligned to a template or an array having a {\tt CYCLIC(4)}
distribution, and a non-unit stride is employed (as in {\tt !HPF\$ ALIGN
X(I) WITH T(3*I)}), then its blocks may have fewer than four
elements.   In this case, when the align stride is three and the
template has a block-cyclic distribution with four template elements
per block, the blocks of {\tt X} have either one or two elements each.
If the align stride were five, then all blocks of {\tt X} would have
exactly one element, as template blocks to which no array element is
aligned are not counted in the reckoning of numbers of blocks.

The portion of a mapped array argument
in an HPF\_LOCAL subprogram
associated with a global array dummy argument  in an HPF program
may be accessed in a block-by-block
fashion.  Three of the local library routines, {\tt LOCAL\_BLKCNT}, {\tt
LOCAL\_LINDEX}, and {\tt LOCAL\_UINDEX}, allow easy access to the local
storage of a particular block.  Their use for this purpose is
illustrated by the following example, in which the local data are
initialized one block at a time:
\end{quote}

\CODE

        EXTRINSIC(HPF_LOCAL) SUBROUTINE NEWKI_DONT_HEBLOCK(X)
        USE MAPPED_ARRAY
        MAPPED_ARRAY X
        INTEGER BL(3)
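        INTEGER, ALLOCATABLE :: LIND1(:), LIND2(:), LIND3(:)
        INTEGER, ALLOCATABLE :: UIND1(:), UIND2(:), UIND3(:)

!       Number of local blocks along each of the three dimensions
        BL = LOCAL_BLKCNT(X)

!       One lower and one upper local index per block in each dimension
        ALLOCATE (LIND1(BL(1)))
        ALLOCATE (LIND2(BL(2)))
        ALLOCATE (LIND3(BL(3)))

        ALLOCATE (UIND1(BL(1)))
        ALLOCATE (UIND2(BL(2)))
        ALLOCATE (UIND3(BL(3)))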

        LIND1 = LOCAL_LINDEX(X, DIM = 1)
        UIND1 = LOCAL_UINDEX(X, DIM = 1)

        LIND2 = LOCAL_LINDEX(X, DIM = 2)
        UIND2 = LOCAL_UINDEX(X, DIM = 2)

        LIND3 = LOCAL_LINDEX(X, DIM = 3)
        UIND3 = LOCAL_UINDEX(X, DIM = 3)

        DO IB1 = 1, BL(1)
          DO IB2 = 1, BL(2)
            DO IB3 = 1, BL(3)
              FORALL (I1 = LIND1(IB1) : UIND1(IB1),  &
                      I2 = LIND2(IB2) : UIND2(IB2),  &
                      I3 = LIND3(IB3) : UIND3(IB3) ) &
                        X(I1, I2, I3) = IB1 + 10*IB2 + 100*IB3
            ENDDO
          ENDDO
        ENDDO
        END SUBROUTINE NEWKI_DONT_HEBLOCK
\EDOC
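
The block sizes quoted above from Annex A for non-unit align strides can be
checked with a short calculation.  The following Fortran 90 sketch assumes a
template {\tt T} whose indices start at 1 and which has a {\tt CYCLIC(4)}
distribution, and an array {\tt X} of assumed size 12 aligned as
{\tt X(I)} with {\tt T(stride*I)}.  It counts how many elements of {\tt X}
fall in each template block and reports the smallest and largest nonempty
counts.  All names and sizes in the program are illustrative.

\CODE
        program align_blocks
        implicit none
!       template block size for CYCLIC(4); assumed SIZE(X)
        integer, parameter :: bsize = 4, nx = 12
        integer :: stride, i, tb, ntb
        integer, allocatable :: cnt(:)

!       try align strides 3 and 5
        do stride = 3, 5, 2
!       number of template blocks spanned by T(stride*1 : stride*nx)
            ntb = (stride*nx + bsize - 1) / bsize
            allocate (cnt(ntb))
            cnt = 0
            do i = 1, nx
!       template block that holds T(stride*i)
                tb = (stride*i - 1) / bsize + 1
                cnt(tb) = cnt(tb) + 1
            end do
!       empty template blocks are not counted as blocks of X
            print '(a,i2,a,i2,a,i2)', 'stride', stride,        &
                  ': smallest block', minval(cnt, mask = cnt > 0),  &
                  ', largest block', maxval(cnt)
            deallocate (cnt)
        end do
        end program align_blocks
\EDOC

\noindent
For a stride of three the counts range from one to two, and for a stride of
five every nonempty block holds exactly one element, in agreement with the
quoted text.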


\subsection{Restrictions}


A process group collectively allocates a mapped array by making a call
to {\tt create\_mapped\_array\_descriptor(),} which returns an opaque mapped
array descriptor (MAD).  HPF routines are called by all members of the process
group as if at the same time.
MADs are passed as arguments to an HPF routine.  All members
of the process group must pass the same MAD in each corresponding position
of the argument list.  The HPF routine itself cannot tell
the origin of the argument, which could in fact have originated from
another HPF program.


More generally, we place three restrictions on the SPMD to HPF model:

\begin{enumerate}

\item The SPMD program is responsible for passing mapped arrays to an HPF
routine in conformance  with the routine's expectations, i.e. the distribution
must match and so on.  (The only exception is when the HPF routine
employs a transcriptive declaration.) This implies that we will
need to invoke HPF routines to explicitly redistribute data.


\item Actual scalar arguments must be consistent across the communicator.

\item Access to {\tt COMMON} blocks is restricted.  One possibility
is to restrict access to {\tt SEQUENTIAL} common blocks (Kernel HPF);
another is to restrict access to only those elements stored
in the local process's memory.
\end{enumerate}

\section{Coordinating Multiple Communicating HPF Programs}

We may extend the SPMD to HPF model of the previous section to
enable coordination of multiple HPF programs---either
the same or different---each running on a different processor group.
Although we consider each case separately, there are certain
conventions and restrictions common to both, which we next discuss.

\subsection{Intergroup communication}

When we have a multiplicity of simultaneously executing
HPF programs running under the aegis of a single harness,
we associate each communicator and its underlying process group with a single
HPF program.
Thus, we need to make separate calls to \HPFINIT, one for each group.
Each process group will manage one or more communicators to
specify allocation and execution over its communication domain.
Note that two communicators may or may not lie on disjoint sets of
processors, though communication between different communicators
occurs as if the communicators were on different sets of physical
processors.

This brings up some important points.
There can be several processor arrangements, with different ranks
and upper and lower bounds.
If a processor changed groups (more
likely, if it made two calls, one as a member of each communicator),
then \HPFINIT\ would have to be called again.
The key constraint is
that the HPF subroutine can declare exactly one \PROCS\ arrangement,
and the SPMD call would have to match this declaration.  The difficulty
is that we cannot pass a \PROCS\ arrangement as an argument,
because such arrangements are static directives in HPF.

In order that 
the values returned by
{\tt NUMBER\_OF\_PROCESSORS} and {\tt PROCESSORS\_SHAPE} be well-defined,
all mapped array arguments passed to a single HPF routine must
have the same communication domain.  This implies that
data motion between mapped arrays allocated to different
process groups must be handled at the SPMD level.
How this is handled is up to the user; it may be done
through message passing or via a library.

\begin{implementors}

It will be useful to provide a routine to
copy a mapped
array in one process group into another mapped array allocated to a different
process group.  The calling sequence of this operation is as follows:\\

\hspace{0.6in}{\tt Copy\_mapped\_array(MAD\_SRC, MAD\_DEST)}\\[0.1in]

\begin{tabular}{r ll}
{\tt IN} & {\tt MAD\_SRC} &             Mapped array descriptor -- source \\
{\tt OUT} & {\tt  MAD\_DEST} &          Mapped array descriptor -- destination 
\end{tabular}


\noindent
There may also be a variant that copies over specified sections
of source and destination.


As with other interface entries, all processes must call these routines
as if at the same time.

\end{implementors}

\subsection{Examples}
\subsubsection{Multi Data Parallelism}

The application is to iteratively solve Poisson's equation on an irregularly
shaped domain, whose geometry is determined at run time.  This application
is relevant to implementing structured multilevel adaptive mesh methods, but
it is a simpler example that is easier to follow.
I'll discuss the 2D case, but there is an obvious 3D analog.


The geometry of the domain is restricted to shapes that
can be defined by a union of rectangles---of {\em different sizes}---that
don't overlap, but
which may touch.
Perhaps the domain is L-shaped,  perhaps something
more elaborate,
but the exact geometry isn't
important.
In fact,  no assumptions can be made about the geometry of the domain:
the number of rectangles, their relative placement, and their sizes
aren't known until run time. 

Because the rectangles have different sizes, we will
allocate them to \PROCS\ subsets of different sizes in order to keep
the workloads reasonably well balanced.  In the course of computation,
each rectangle must furnish boundary conditions that couple
it to the other rectangles.  This is an operation called {\tt
FILL\_PATCH( )}. The code for our example follows.

\CODE
        extrinsic(hpf_local) program iter
        use MAPPED_ARRAY
        MAPPED_ARRAY a, b

        logical converged
        real resid
        real, external :: smooth
        real, parameter :: EPSILON = ...

!       determine the size and shapes of the blocks
!       set up the process groups and communicators, one for each block
!       this is spmd code, so we use the same name for each communicator
!       it is understood that communicators have been prearranged so that
!       each instance of the communicator variable refers to the correct
!       communication domain

        a = create_mapped_array_descriptor( real, m, n, BLOCK, BLOCK,
     &                                      .... communicator ... )
        b = create_mapped_array_descriptor( ... )

        converged = .false.
        do while ( .not. converged)

            call fillpatch(a)
!       this is a call to HPF (non-local)
            resid = smooth(a, b)

!       invoke MPI to do a global reduction over all processes
            call _reduce_sum(resid, MPI_COMM_WORLD)
            converged = resid .lt. EPSILON
            call _copy(b,a)
        enddo
        end program
\EDOC
\pagebreak
\subsubsection{Task Parallelism}
We use the 2D FFT example from the current Task Parallelism Proposal.

\CODE
        use mapped_array
        mapped_array a1, a2
        logical iend, ReadData


!       ROW and COL are communicators for the two process groups

        a1 = create_mapped_array_descriptor( real, N, N, BLOCK, SERIAL,
     &                                      .... ROW ... )
        a2 = create_mapped_array_descriptor( REAL, N, N, SERIAL, BLOCK,
     &                                      .... COL ... )

        do while ( .true. )

            if (_myGroup(ROW)) then
!       Invoke an HPF routine to read in the data
                iend = ReadData(a1)
                if (.not. iend) then
!       Invoke an HPF routine to do the row FFTs
                    call rowfft(a1)
                end if
            end if

!       Copy the end of file flag between communicators
            call _copyScalar(iend,ROW,COL)
            if (iend) exit
            call _copyMappedArray(a1,ROW,a2,COL)

            if (_myGroup(COL)) then
                call colfft(a2)
!       Invoke an HPF routine to write out the data
                call WriteData(a2)
            end if
        end do
\EDOC
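
To give a feel for the data motion implied by copying {\tt a1} into
{\tt a2}, the following Fortran 90 sketch counts how many elements each
process of the row group would send to each process of the column group.
It assumes that both groups have {\tt p} processes, that {\tt BLOCK} uses
block size $\lceil n/p \rceil$, and that {\tt SERIAL} means the axis is not
distributed.  The values of {\tt n} and {\tt p} and all names are
illustrative, and no particular implementation of the copy routine is
implied.

\CODE
        program transpose_traffic
        implicit none
!       assumed array extent and processes per group
        integer, parameter :: n = 8, p = 2
        integer :: b, i, j, src, dst
        integer :: traffic(0:p-1, 0:p-1)

        b = (n + p - 1) / p
        traffic = 0
        do j = 1, n
            do i = 1, n
!       owner under (BLOCK, SERIAL): rows are blocked
                src = (i - 1) / b
!       owner under (SERIAL, BLOCK): columns are blocked
                dst = (j - 1) / b
                traffic(src, dst) = traffic(src, dst) + 1
            end do
        end do

!       one line per source process, one count per destination process
        do src = 0, p - 1
            print *, 'source', src, ' sends', traffic(src, :)
        end do
        end program transpose_traffic
\EDOC

\noindent
With these values every source process sends data to every destination
process, which is the all-to-all pattern characteristic of a distributed
transpose.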
\end{document}

