From owner-hpff-doc  Fri Sep  6 12:31:27 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id MAA15812 for hpff-doc-out; Fri, 6 Sep 1996 12:31:27 -0500 (CDT)
Date: Fri, 6 Sep 1996 12:31:27 -0500 (CDT)
Message-Id: <199609061731.MAA15812@cs.rice.edu>
From: offner@hpc.pko.dec.com (Carl Offner)
Subject: hpff-doc: mapping-subr.tex (again)
Sender: owner-hpff-doc
Precedence: bulk

% File: mapping-subr.tex

% Contents:
% Mapping constructs for dummy arguments for HPF 2.0 document,
% including
%       interface rules
%       INHERIT
%
% If you don't have LaTeX2e available, uncomment the next three lines:
%\def\emph#1{{\em #1}}
%\def\texttt#1{{\tt #1}}
%\def\textit#1{{\it #1}}

% Revision history:
% Sep-04-96     Edit by Carl Offner, Digital.  Added forward reference
%               to some approved extensions that allow subroutines to
%               permanently modify data.  Also added new subsection
%               making it clear that explicit mapping directives are
%               characteristics of dummy arguments and function
%               results.
% Aug-13-96     Edit by Carl Offner, Digital.  Some clarifications and
%               corrections of typos, based on suggestions from Rob
%               Schreiber.
%
% Aug-01-96     Edit by Carl Offner, Digital.  Based on 
%               decisions taken at the July HPFF meeting,
%               i)  DISTRIBUTE is no longer allowed on INHERITed
%                   arguments.
%               ii) The section describing when an explicit interface
%                   is not needed when arguments are explicitly mapped
%                   has been vastly tightened up.
%               In addition, descriptive syntax has been more clearly
%               identified as being left in purely for backward
%               compatibility; and the TERPSICHORE-FRUG example has
%               been reworked.
%
% Jun-03-96     Edit by Carl Offner, Digital.
%               Rearrangements, many edits, and two new sections (an
%               introduction and the section on when an explicit
%               interface is necessary).
%
% May-10-96     Created by Charles Koelbel, Rice University
%               (from HPF 1.2 document and HPF 2.0 proposals)

\chapter{Data Mapping in Subprogram Interfaces}
\label{ch-mapping-subr}

{\em
Comments on this chapter should be directed to 
Piyush Mehrotra ({\tt pm@icase.edu}), 
Carl Offner ({\tt offner@hpc.pko.dec.com}), 
Guy Steele ({\tt Guy.Steele@east.sun.com}),
and \\
{\tt hpff-doc@cs.rice.edu}.
Please use ``{\tt Comments on Mapping in Subprogram Calls}'' as the
{\tt Subject:} line.
\par
}


\bigskip  %% Take this out in the final document.

\emph{In this chapter, phrases such as ``it is the responsibility of
  the called subprogram to remap the argument'' are constraints on the
  implementation (i.e., on the generated code produced by the
  compiler), not on the source code produced by the programmer.}



\section{Introduction}
\label{mapsub:introduction}

\emph{This introduction gives an overview of the ways in which mapping
directives interact with argument passing to subprograms.  The
language used here, however, is not definitive; the subsequent
sections of this chapter contain the authoritative rules.}

In addition to the data mapping features described in
Chapter~\ref{ch-mapping-base}, HPF allows a number of options for
describing the mapping of dummy arguments to subprograms.

The mapping of each such dummy argument may be related to the mapping
of its associated actual argument in the calling main program or
procedure (the ``caller'') in several different ways.  To allow for
this, mapping directives applied to dummy arguments can have three
different syntactic forms: \emph{prescriptive}, \emph{descriptive},
and \emph{transcriptive}.

HPF provides these three forms to allow the programmer either to
specify that the data is to be left in place, or to specify that
during the execution of the call the data must be automatically
remapped into a new and presumably more efficient mapping for the
duration of the execution of the called subprogram.

The meaning of these forms is as follows:

\begin{description}
  
\item[prescriptive] The directive describes the mapping of the dummy.
  However, the actual as passed may not have this mapping.  \emph{If
  it does not}, it is the responsibility of the called subprogram to
  remap the argument as specified, and to restore the original mapping
  on exit.  (However, see the discussion on explicit interfaces and
  remapping below.)

  Prescriptive directives are syntactically identical to directives
  occurring elsewhere in the program.  For instance, if \texttt{A} is
  a dummy argument,

                                                                \CODE
!HPF$ DISTRIBUTE A (BLOCK, CYCLIC)
                                                                \EDOC
  is a prescriptive directive.
  
\item[descriptive] The directive describes the mapping of the dummy.
  It is the responsibility of the caller to ensure that the actual as
  passed has this mapping.  (Similarly, remapping to restore the
  original mapping on exit is also done by the caller.)

  Descriptive directives look like prescriptive directives, except
  that an asterisk precedes the description.  For instance,
                                                                \CODE
!HPF$ DISTRIBUTE A *(BLOCK, CYCLIC)
                                                                \EDOC
  is a descriptive directive.
  
\item[transcriptive] The mapping is unspecified.  The called
  subprogram must accept the mapping of the argument as it is passed.
  (Of course this means that the caller must pass this mapping
  information at run time.)
  
  Transcriptive directives are written with a single asterisk for
  distributions and processor arrangements; for instance
                                                                \CODE
!HPF$ DISTRIBUTE A *
                                                                \EDOC
  is a transcriptive directive.  The \texttt{INHERIT} directive
  (see Section~\ref{INHERIT-SECTION}) is used to specify a transcriptive
  alignment.

\end{description}
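
The three forms can be placed side by side.  The following is only an
illustrative sketch: the subroutine \texttt{FORMS} and its dummy
arguments \texttt{P}, \texttt{D}, and \texttt{T} are hypothetical
names that do not appear elsewhere in this chapter, and each dummy is
given exactly one of the three forms.

                                                                \CODE
      SUBROUTINE FORMS(P, D, T)
      REAL P(100,100), D(100,100), T(100,100)
! Prescriptive: P is remapped to (BLOCK, CYCLIC) if the actual differs.
!HPF$ DISTRIBUTE P (BLOCK, CYCLIC)
! Descriptive: the actual is asserted to be (BLOCK, CYCLIC) already.
!HPF$ DISTRIBUTE D *(BLOCK, CYCLIC)
! Transcriptive: the distribution of the actual is accepted as passed.
!HPF$ DISTRIBUTE T *
      P = D + T
      END SUBROUTINE FORMS
                                                                \EDOC
In this sketch only \texttt{P} and \texttt{D} can give rise to
remapping at a call; \texttt{T} is used as passed, so its mapping must
be made available at run time.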

Both distribution formats and processor arrangements can be specified
prescriptively, descriptively, or transcriptively.  Alignment is more
complicated, because of the need to specify the template with which
the dummy is aligned.  If no template is specified (that is, the dummy
has neither an \texttt{ALIGN} directive nor the \texttt{INHERIT}
attribute), the template is the \emph{natural template} of the dummy.
(``Natural template'' is defined in Section~\ref{mapsub:templates}
below.)  Otherwise, one of the following disjoint possibilities must
be true:

\begin{itemize}

\item The template is explicitly specified by a prescriptive ALIGN directive.

\item The template is explicitly specified by a descriptive ALIGN directive.
  
\item The template is \emph{inherited}.  This is specified by giving
  the dummy the \texttt{INHERIT} attribute (described in
  Section~\ref{INHERIT-SECTION} below).  This implicitly specifies the
  template to be a copy of the template with which the corresponding
  actual argument is ultimately aligned; further, the alignment of the
  dummy with that template is the same as that of the corresponding
  actual.  This is in effect a transcriptive form of alignment.

\end{itemize}

This is restated more precisely in Section~\ref{mapsub:templates}
below.

If remapping is necessary at a subprogram interface, this fact must be
visible in the caller.  This is a consequence of the requirements for
explicit interfaces in Section~\ref{mapsub:ExplicitInterfaces}.

As a result of this, an implementation may choose to have all
necessary remapping done by the caller.  This in effect eliminates the
distinction between prescriptive and descriptive mappings, and
eliminates the need for the descriptive form of the directives.

While both prescriptive and descriptive syntactic forms remain in this
document, and while they are each treated in this chapter, it is fair
to say that the descriptive syntax is at this point maintained only
for backwards compatibility.

\begin{users}
  Although it is possible to write combinations of mapping directives
  that are, for instance, partially prescriptive and partially
  transcriptive, there is probably no virtue in doing so.  The point
  of these directives is to enable the compiler to handle any
  necessary remapping correctly and efficiently.  Remapping can
  happen for one or more of the following reasons:

  \begin{itemize}
  \item  to make the alignment of the actual and the dummy agree;
  \item  to make the distribution of the actual and the dummy agree;
  \item  to make the processor arrangement of the actual and the dummy agree.
  \end{itemize}

  For most machines, there is no real difference in the cost of
  remapping for any of these reasons.  It is therefore a better
  practice in general to make a mapping either purely transcriptive,
  purely prescriptive, or purely descriptive.

  While transcriptive mappings can be useful in writing libraries,
  they impose a run-time cost on the subprogram.  They should
  therefore be avoided in normal user code.
\end{users}

\section{What Remapping is Required, and Who Does It}

If there is an explicit interface for the called subprogram and that
interface contains prescriptive or descriptive mapping directives for
a dummy argument, and if a remapping of the corresponding actual
argument is necessary, the call should proceed as if the data were
copied to a temporary variable to match the mapping of the dummy
argument as expressed by the directives in the explicit interface.
The template of the dummy will then be as declared in the interface.

If in such a case the mapping directives are descriptive, then it is
the responsibility of the caller to perform the necessary remapping.
That is, the caller must provide an actual argument matching such
directives, so that the descriptive directives correctly describe the
dummy arguments to the subprogram.

\begin{implementors}
In fact, an implementation may choose to have \emph{all} necessary
remapping done by the caller.
\end{implementors}

If there is no explicit interface, then no remapping will be
necessary; this is a consequence of the requirements in
Section~\ref{mapsub:ExplicitInterfaces}.

An overriding principle is that \emph{any remapping of arguments is
not visible to the caller}.  That is, when the subprogram returns and
the caller resumes execution, all objects accessible to the caller
after the call are mapped exactly as they were before the call.  It is
not possible for a procedure to change the mapping of any object in a
manner visible to its caller.

\begin{users}
Some Approved Extensions relax this restriction; see for instance
Sections~\ref{DYNAMIC-DUMMY-SECTION} and \ref{POINTERS-SECTION}.
\end{users}
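
The principle that remapping is invisible to the caller can be
illustrated by the following sketch, in which the names
\texttt{CALLER}, \texttt{WORK}, \texttt{X}, and \texttt{Y} are
hypothetical.  The explicit interface makes the prescriptive mapping
of \texttt{Y} visible at the call site, so the need for remapping is
known in the caller.

                                                                \CODE
      PROGRAM CALLER
      REAL X(1000)
!HPF$ DISTRIBUTE X (CYCLIC)
      INTERFACE
        SUBROUTINE WORK(Y)
        REAL Y(1000)
!HPF$ DISTRIBUTE Y (BLOCK)
        END SUBROUTINE WORK
      END INTERFACE
      X = 1.0
! The call proceeds as if X were copied to a BLOCK temporary.
      CALL WORK(X)
! Here X is again mapped CYCLIC, exactly as before the call.
      PRINT *, X(1)
      END PROGRAM CALLER

      SUBROUTINE WORK(Y)
      REAL Y(1000)
!HPF$ DISTRIBUTE Y (BLOCK)
      Y = 2.0 * Y
      END SUBROUTINE WORK
                                                                \EDOC
The values assigned to \texttt{Y} are of course returned in
\texttt{X}; only the mapping is restored.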

\section{Distributions and Processor Arrangements}
\label{mapsub:DistProcArr}

In a \texttt{DISTRIBUTE} directive where every \textit{distributee} is
a dummy argument, either the \textit{dist-format-clause} or the
\textit{dist-target}, or both, may begin with, or consist of, an
asterisk.

\begin{itemize}
  
\item Without an asterisk, a \textit{dist-format-clause} or
  \textit{dist-target} is prescriptive; the clause describes a
  distribution and constitutes a request of the language processor to
  make it so.  This might entail the subprogram remapping or copying
  the actual argument on entry at run time in order to satisfy the
  requested distribution for the dummy.
  
\item Starting with an asterisk, a \textit{dist-format-clause} or
  \textit{dist-target} is descriptive; the clause describes a
  distribution and constitutes an assertion to the language processor
  that on entry to the subprogram it will already be so.  The
  programmer claims that, for every call to the subprogram, the actual
  argument will be such that the stated distribution already describes
  the mapping of that data.  (The intent is that if the argument is
  passed by reference, no movement of the data \emph{by the
  subprogram} will be necessary at run time.  All this is under the
  assumption that the language processor has observed all other
  directives.  While a conforming HPF language processor is not
  required to obey mapping directives, it should handle descriptive
  directives with the understanding that their implied assertions are
  relative to this assumption.)

\item Consisting of only an asterisk, a \textit{dist-format-clause} or
  \textit{dist-target} is transcriptive; the clause says nothing about
  the distribution but constitutes a request of the language processor
  to copy that aspect of the distribution from that of the actual
  argument.  (The intent is that if the argument is passed by
  reference, no movement of the data will be necessary at run time.)
\end{itemize}

It is possible that, in a single \texttt{DISTRIBUTE} directive, the
\textit{dist-format-clause} might have an asterisk but not the
\textit{dist-target}, or vice versa.

\subsection{Examples}

These examples of \texttt{DISTRIBUTE} directives for dummy arguments
illustrate the various combinations:

                                                                        \CODE
!HPF$ DISTRIBUTE URANIA (CYCLIC) ONTO GALILEO
                                                                        \EDOC
The language processor should do whatever it takes to cause
\texttt{URANIA} to have a \texttt{CYCLIC} distribution on the
processor arrangement \texttt{GALILEO}.
                                                                        \CODE
!HPF$ DISTRIBUTE POLYHYMNIA * ONTO ELVIS
                                                                        \EDOC
The language processor should do whatever it takes to cause
\texttt{POLYHYMNIA} to be distributed onto the processor arrangement
\texttt{ELVIS}, using whatever distribution format it currently has
(which might be on some other processor arrangement).
                                                                        \CODE
!HPF$ DISTRIBUTE THALIA *(CYCLIC) ONTO FLIP
                                                                        \EDOC
The language processor should do whatever it takes to cause
\texttt{THALIA} to have a \texttt{CYCLIC} distribution on the
processor arrangement \texttt{FLIP}; \texttt{THALIA} is asserted to
already have a \texttt{CYCLIC} distribution, though it might be on
some other processor arrangement.
                                                                        \CODE
!HPF$ DISTRIBUTE CALLIOPE (CYCLIC) ONTO *HOMER
                                                                        \EDOC
The language processor should do whatever it takes to cause
\texttt{CALLIOPE} to have a \texttt{CYCLIC} distribution on the
processor arrangement \texttt{HOMER}; \texttt{CALLIOPE} is asserted to
already be distributed onto \texttt{HOMER}, though it might be with
some other distribution format.
                                                                        \CODE
!HPF$ DISTRIBUTE MELPOMENE * ONTO *EURIPIDES
                                                                        \EDOC
\texttt{MELPOMENE} is asserted to already be distributed onto
\texttt{EURIPIDES}; the distribution format of the actual argument is
retained, so that, if possible, no data movement should occur.
                                                                        \CODE
!HPF$ DISTRIBUTE CLIO *(CYCLIC) ONTO *HERODOTUS
                                                                        \EDOC
\texttt{CLIO} is asserted to already be distributed \texttt{CYCLIC}
onto \texttt{HERODOTUS}, so that, if possible, no data movement should
occur.
                                                                        \CODE
!HPF$ DISTRIBUTE EUTERPE (CYCLIC) ONTO *
                                                                        \EDOC
The language processor should do whatever it takes to cause
\texttt{EUTERPE} to have a \texttt{CYCLIC} distribution onto whatever
processor arrangement the actual was distributed onto.
                                                                        \CODE
!HPF$ DISTRIBUTE ERATO * ONTO *
                                                                        \EDOC
The mapping of \texttt{ERATO} should not be changed from that of the
actual argument.
                                                                        \CODE
!HPF$ DISTRIBUTE ARTHUR_MURRAY *(CYCLIC) ONTO *
                                                                        \EDOC
\texttt{ARTHUR\_MURRAY} is asserted to already be distributed
\texttt{CYCLIC} onto whatever processor arrangement the actual
argument was distributed onto, and no data movement should occur.

Also note that \texttt{DISTRIBUTE ERATO * ONTO *} does not mean the
same thing as
                                                                        \CODE
!HPF$ DISTRIBUTE ERATO *(*) ONTO *
                                                                        \EDOC
The latter means:  \texttt{ERATO} is asserted to already be distributed
\texttt{*} (that is, on-processor) onto whatever processor arrangement the
actual was distributed onto.  The processor arrangement is necessarily
scalar in this case.

\subsection{What Happens When a Clause is Omitted}

One may omit either the \textit{dist-format-clause} or the
\textit{dist-onto-clause} for a dummy argument.  This is understood as
follows:

If the dummy argument has the \texttt{INHERIT} attribute (see
Section~\ref{INHERIT-SECTION}), then no distribution directive is
allowed in any case:  the distribution as well as the alignment is
inherited from the actual argument.

In any other case in which distribution information is omitted, the
compiler may choose the omitted distribution format or target
processor arrangement arbitrarily.

Here are two examples:
                                                                         \CODE
!HPF$ DISTRIBUTE WHEEL_OF_FORTUNE *(CYCLIC)
                                                                         \EDOC
\texttt{WHEEL\_OF\_FORTUNE} is asserted to already be \texttt{CYCLIC}.
As long as it is kept \texttt{CYCLIC}, it may be remapped onto some
other processor arrangement, but there is no reason to.
                                                                          \CODE
!HPF$ DISTRIBUTE ONTO *TV :: DAVID_LETTERMAN
                                                                          \EDOC
\texttt{DAVID\_LETTERMAN} is asserted to already be distributed
on \texttt{TV} in some fashion.  The distribution format may be
changed as long as \texttt{DAVID\_LETTERMAN} is kept on \texttt{TV}.
(Note that this declaration must be made in attributed form; the
statement form

                                                                        \CODE
!HPF$ DISTRIBUTE DAVID_LETTERMAN ONTO *TV         !Nonconforming
                                                                        \EDOC
does not conform to the syntax for a \texttt{DISTRIBUTE} directive.)




\section{Alignment}

\subsection{The Template of the Dummy Argument}
\label{mapsub:templates}

Here we describe precisely how the template with which the dummy
argument is ultimately aligned is determined.

First, templates are not passed through the subprogram argument
interface.  A dummy argument and its corresponding actual argument may
be aligned to the same template only if that template is accessible in
both the caller and the called subprogram either through host
association or use association.

In any other case, the template with which a dummy argument is aligned is
always distinct from the template with which the actual argument is
aligned, though it may be a copy (see Section~\ref{INHERIT-SECTION}).
On exit from a procedure, an HPF implementation arranges that the
actual argument is aligned with the same template with which it was
aligned before the call.

The template of the dummy argument is determined in one of three ways:

\begin{itemize}
\item If the dummy argument appears explicitly as an \textit{alignee}
  in an \texttt{ALIGN} directive, its template is the
  \textit{align-target} if the \textit{align-target} is a template;
  otherwise its template is the template with which the
  \textit{align-target} is ultimately aligned.
  
\item If the dummy argument is not explicitly aligned and does not
  have the \texttt{INHERIT} attribute (described in
  Section~\ref{INHERIT-SECTION} below), then the template has the same
  shape and bounds as the dummy argument; this is called the
  \textit{natural template} for the dummy.

  (Thus, all the examples in Section~\ref{mapsub:DistProcArr} use
  the natural template.)

\item If the dummy argument is not explicitly aligned and does have
  the \texttt{INHERIT} attribute, then the template is ``inherited'' from
  the actual argument according to the following rules:

  \begin{itemize}
    
  \item If the actual argument is a whole array, the template of the
    dummy is a copy of the template with which the actual argument is
    ultimately aligned.
    
  \item If the actual argument is an array section of array \(A\)
    where no subscript is a vector subscript, then the template of the
    dummy is a copy of the template with which \(A\) is ultimately
    aligned.

  \item If the actual argument is any other expression, the shape
    and distribution of the template may be chosen arbitrarily by
    the language processor (and therefore the programmer cannot know
    anything \textit{a priori} about its distribution).

  \end{itemize}
  
  In all of these cases, we say that the dummy has an \textit{inherited
  template}.

\end{itemize}
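
The three cases can be seen together in the following sketch.  The
names \texttt{TEMPLATES}, \texttt{GRID}, \texttt{U}, \texttt{V}, and
\texttt{W} are hypothetical and are used only for illustration; the
alignment chosen for \texttt{U} is likewise arbitrary.

                                                                \CODE
      SUBROUTINE TEMPLATES(U, V, W)
      REAL U(50), V(50), W(50)
!HPF$ TEMPLATE GRID(100)
! U is explicitly aligned: its template is the align-target GRID.
!HPF$ ALIGN U(I) WITH GRID(2*I)
! V has neither ALIGN nor INHERIT: natural template of shape [50].
! W has the INHERIT attribute: its template is inherited.
!HPF$ INHERIT W
      U = V + W
      END SUBROUTINE TEMPLATES
                                                                \EDOC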


\subsection{The INHERIT Directive}
\label{INHERIT-SECTION}

The \texttt{INHERIT} directive specifies that a dummy argument should be
aligned to a copy of the template of the corresponding actual argument
in the same way that the actual argument is aligned.

                                                                        \BNF
inherit-directive      \IS  INHERIT dummy-argument-name-list
                                                                        \FNB

The \texttt{INHERIT} directive causes the named subprogram dummy
arguments to have the \texttt{INHERIT} attribute.  Only dummy
arguments may have the \texttt{INHERIT} attribute.  An object must not
have both the \texttt{INHERIT} attribute and the \texttt{ALIGN}
attribute.  The \texttt{INHERIT} directive may appear only in a
\textit{specification-part} of a scoping unit.

If a dummy argument has the \texttt{TARGET} attribute and no explicit
mapping attributes, then the \texttt{INHERIT} attribute is implicitly
assumed.  (See Section~\ref{mapsub:pointers}.)

The \texttt{INHERIT} attribute specifies that the template for a dummy
argument should be inherited, by making a copy of the template of the
actual argument.  Moreover, no other explicit mapping directive may
appear for an argument with the \texttt{INHERIT} attribute: the
\texttt{INHERIT} attribute implies a distribution of
\texttt{DISTRIBUTE~*~ONTO~*} for the inherited template.  Thus, the
net effect is to tell the compiler to leave the data exactly where it
is, and not attempt to remap the actual argument.  The dummy argument
will be mapped in exactly the same manner as the actual argument; the
subprogram must be compiled in such a way as to work correctly no
matter how the actual argument may be mapped onto abstract processors.

\subsubsection{Examples}

Here is a straightforward example of the use of \texttt{INHERIT}:

                                            \CODE

      REAL DOUGH(100)
!HPF$ DISTRIBUTE DOUGH(BLOCK(10))
      CALL PROBATE( DOUGH(7:23:2) )
      ...
      SUBROUTINE PROBATE(BREAD)
      REAL BREAD(9)
!HPF$ INHERIT BREAD
                                             \EDOC

The inherited template of \texttt{BREAD} has shape [100]; element 
\texttt{BREAD(I)} is aligned with element 5 + 2*I of the inherited
template, and that template has a \texttt{BLOCK(10)} distribution.
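
The element correspondence can be checked by direct arithmetic.  The
following sketch (the program name \texttt{OWNERS} and the variables
\texttt{I} and \texttt{K} are hypothetical) prints, for each element
of \texttt{BREAD}, the index of the template element with which it is
aligned and the number of the block of ten template elements that
contains it.  It assumes that the template index range is 1:100 and
that blocks are numbered from 1.

                                            \CODE
      PROGRAM OWNERS
      INTEGER I, K
      DO I = 1, 9
! BREAD(I) is DOUGH(7 + 2*(I-1)), that is, DOUGH(5 + 2*I).
        K = 5 + 2*I
! Integer division gives the BLOCK(10) block holding element K.
        PRINT *, I, K, (K - 1)/10 + 1
      END DO
      END PROGRAM OWNERS
                                             \EDOC
Under these assumptions \texttt{BREAD(1:2)} lies in the first block,
\texttt{BREAD(3:7)} in the second, and \texttt{BREAD(8:9)} in the
third.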


More complicated examples can easily be constructed.  It is important
to bear in mind that the inherited template may have a different rank
than the rank of the dummy, and it may even have a different rank than
the rank of the actual.  For instance, one might have a program
containing the following:

                                                \CODE
      REAL A(100,100)
!HPF$ TEMPLATE T(100,100,100)
!HPF$ DISTRIBUTE T(BLOCK,CYCLIC,*)
!HPF$ ALIGN A(I,J) WITH T(J,3,I)
      CALL SUBR(A(:,7))
      ...
      SUBROUTINE SUBR(D)
      REAL D(100)
!HPF$ INHERIT D
                                                 \EDOC

In this case, the dummy \texttt{D} has rank 1.  It corresponds to a
1-dimensional section of a 2-dimensional actual \texttt{A} which in
turn is aligned with a 2-dimensional section of a 3-dimensional
template \texttt{T}.  The template of \texttt{D} is a copy of this
3-dimensional template.  \texttt{D} is aligned with the section
\texttt{(7, 3, :)} of this inherited template.  Thus, the ``visible''
dimension of the dummy \texttt{D} is distributed \texttt{*}, although
if the call statement had been

                                                \CODE
      CALL SUBR(A(7,:))
                                                 \EDOC

\noindent
for instance, the ``visible'' dimension of the dummy would be
distributed \texttt{BLOCK}.


\subsection{ALIGN Directives}

The presence or absence of an asterisk at the start of an
\textit{align-spec} has the same meaning as in a
\textit{dist-format-clause}: it specifies whether the \texttt{ALIGN}
directive is descriptive or prescriptive, respectively.

If an \textit{align-spec} that does not begin with \texttt{*} is
applied to a dummy argument, the meaning is that the dummy argument
will be forced to have the specified alignment on entry to the
subprogram.  This may require either the caller or the subprogram to
temporarily remap the data of the actual argument or a copy thereof.

Note that a dummy argument may also be used as an \textit{align-target}.
                                                                        \CODE
      SUBROUTINE NICHOLAS(TSAR,CZAR)
      REAL, DIMENSION(1918) :: TSAR,CZAR
!HPF$ INHERIT :: TSAR
!HPF$ ALIGN WITH TSAR :: CZAR
                                                                        \EDOC

In this example the first dummy argument, \texttt{TSAR}, remains
aligned with the corresponding actual argument, while the second dummy
argument, \texttt{CZAR}, is forced to be aligned with the first dummy
argument.  If the two actual arguments are already aligned, no
remapping of the data will be required at run time.  If they are not,
some remapping will take place.

If the \textit{align-spec} begins with ``\texttt{*}'', then the
\textit{alignee} must be a dummy argument.  The ``\texttt{*}''
indicates that the \texttt{ALIGN} directive constitutes a guarantee on
the part of the programmer that, on entry to the subprogram, the
indicated alignment will already be satisfied by the dummy argument,
without any action to remap it required (on the part of the
subprogram) at run time.  For example:

                                                                        \CODE
      SUBROUTINE GRUNGE(PLUNGE,SPONGE)
      REAL PLUNGE(1000),SPONGE(1000)
!HPF$ INHERIT SPONGE
!HPF$ ALIGN PLUNGE WITH *SPONGE
                                                                        \EDOC

This asserts that, for every \texttt{J} in the range \texttt{1:1000},
on entry to subroutine \texttt{GRUNGE}, the directives in the program
have specified that \texttt{PLUNGE(J)} is currently mapped to the same
abstract processor as \texttt{SPONGE(J)}.

The intent is that if the language processor has in fact honored the
directives, then no interprocessor communication on the part of the
subprogram will be required to achieve the specified alignment.  As
usual, if an implementation performs all remapping in the caller, a
prescriptive form of the \texttt{ALIGN} directive would mean exactly
the same thing.

It is not permitted to say simply ``\texttt{ALIGN WITH *}''; an
\textit{align-target} must follow the asterisk.  (The proper way to
say ``accept any alignment'' is \texttt{INHERIT}.)

If a dummy argument has no explicit \texttt{ALIGN} or
\texttt{DISTRIBUTE} attribute, then the compiler provides an implicit
alignment and distribution specification, one that could have been
described explicitly without any ``assertion asterisks''.

\subsubsection{Example}

Without using \texttt{INHERIT}, explicit alignment of a dummy argument
may be necessary to ensure that no remapping takes place at the
subprogram boundary.  Here is an example:

                                                                       \CODE
      LOGICAL FRUG(128)
!HPF$ PROCESSORS DANCE_FLOOR(16)
!HPF$ DISTRIBUTE (BLOCK) ONTO DANCE_FLOOR::FRUG
      CALL TERPSICHORE(FRUG(1:40:3))
                                                                        \EDOC

The array section \texttt{FRUG(1:40:3)} is mapped onto abstract
processors in the following manner:
\begin{center}
\setlength{\unitlength}{0.01in}
\begin{picture}(560,225)(0,0)
\put(0,200){\makebox(35,25){\small\rm 1}}
\put(35,200){\makebox(35,25){\small\rm 2}}
\put(70,200){\makebox(35,25){\small\rm 3}}
\put(105,200){\makebox(35,25){\small\rm 4}}
\put(140,200){\makebox(35,25){\small\rm 5}}
\put(175,200){\makebox(35,25){\small\rm 6}}
\put(210,200){\makebox(35,25){\small\rm 7}}
\put(245,200){\makebox(35,25){\small\rm 8}}
\put(280,200){\makebox(35,25){\small\rm 9}}
\put(315,200){\makebox(35,25){\small\rm 10}}
\put(350,200){\makebox(35,25){\small\rm 11}}
\put(385,200){\makebox(35,25){\small\rm 12}}
\put(420,200){\makebox(35,25){\small\rm 13}}
\put(455,200){\makebox(35,25){\small\rm 14}}
\put(490,200){\makebox(35,25){\small\rm 15}}
\put(525,200){\makebox(35,25){\small\rm 16}}
\thinlines
\multiput(0,25)(0,25){7}{\line(1,0){560}}
\thicklines
\multiput(0,0)(0,200){2}{\line(1,0){560}}
\multiput(0,0)(35,0){17}{\line(0,1){200}}
\put(0,175){\makebox(35,25){\tt 1}}
\put(0,100){\makebox(35,25){\tt 4}}
\put(0,25){\makebox(35,25){\tt 7}}
\put(35,150){\makebox(35,25){\tt 10}}
\put(35,75){\makebox(35,25){\tt 13}}
\put(35,0){\makebox(35,25){\tt 16}}
\put(70,125){\makebox(35,25){\tt 19}}
\put(70,50){\makebox(35,25){\tt 22}}
\put(105,175){\makebox(35,25){\tt 25}}
\put(105,100){\makebox(35,25){\tt 28}}
\put(105,25){\makebox(35,25){\tt 31}}
\put(140,150){\makebox(35,25){\tt 34}}
\put(140,75){\makebox(35,25){\tt 37}}
\put(140,0){\makebox(35,25){\tt 40}}
\end{picture}
\end{center}

Suppose first that the interface to the subroutine
\texttt{TERPSICHORE} looks like this:

                                                                        \CODE
      SUBROUTINE TERPSICHORE(FOXTROT)
      LOGICAL FOXTROT(:)
!HPF$ INHERIT FOXTROT
                                                                        \EDOC
The template of \texttt{FOXTROT} is a copy of the 128-element
template of the whole array \texttt{FRUG}.  The template is mapped like this:

\begin{center}
\setlength{\unitlength}{0.01in}
\begin{picture}(560,225)(0,0)
\put(0,200){\makebox(35,25){\small\rm 1}}
\put(35,200){\makebox(35,25){\small\rm 2}}
\put(70,200){\makebox(35,25){\small\rm 3}}
\put(105,200){\makebox(35,25){\small\rm 4}}
\put(140,200){\makebox(35,25){\small\rm 5}}
\put(175,200){\makebox(35,25){\small\rm 6}}
\put(210,200){\makebox(35,25){\small\rm 7}}
\put(245,200){\makebox(35,25){\small\rm 8}}
\put(280,200){\makebox(35,25){\small\rm 9}}
\put(315,200){\makebox(35,25){\small\rm 10}}
\put(350,200){\makebox(35,25){\small\rm 11}}
\put(385,200){\makebox(35,25){\small\rm 12}}
\put(420,200){\makebox(35,25){\small\rm 13}}
\put(455,200){\makebox(35,25){\small\rm 14}}
\put(490,200){\makebox(35,25){\small\rm 15}}
\put(525,200){\makebox(35,25){\small\rm 16}}
\thinlines
\multiput(0,25)(0,25){7}{\line(1,0){560}}
\thicklines
\multiput(0,0)(0,200){2}{\line(1,0){560}}
\multiput(0,0)(35,0){17}{\line(0,1){200}}
\put(0,175){\makebox(35,25){\tt 1}}
\put(0,150){\makebox(35,25){\tt 2}}
\put(0,125){\makebox(35,25){\tt 3}}
\put(0,100){\makebox(35,25){\tt 4}}
\put(0,75){\makebox(35,25){\tt 5}}
\put(0,50){\makebox(35,25){\tt 6}}
\put(0,25){\makebox(35,25){\tt 7}}
\put(0,0){\makebox(35,25){\tt 8}}
\put(35,175){\makebox(35,25){\tt 9}}
\put(35,150){\makebox(35,25){\tt 10}}
\put(35,125){\makebox(35,25){\tt 11}}
\put(35,100){\makebox(35,25){\tt 12}}
\put(35,75){\makebox(35,25){\tt 13}}
\put(35,50){\makebox(35,25){\tt 14}}
\put(35,25){\makebox(35,25){\tt 15}}
\put(35,0){\makebox(35,25){\tt 16}}
\put(70,175){\makebox(35,25){\tt 17}}
\put(70,150){\makebox(35,25){\tt 18}}
\put(70,125){\makebox(35,25){\tt 19}}
\put(70,100){\makebox(35,25){\tt 20}}
\put(70,75){\makebox(35,25){\tt 21}}
\put(70,50){\makebox(35,25){\tt 22}}
\put(70,25){\makebox(35,25){\tt 23}}
\put(70,0){\makebox(35,25){\tt 24}}
\put(105,175){\makebox(35,25){\tt 25}}
\put(105,150){\makebox(35,25){\tt 26}}
\put(105,125){\makebox(35,25){\tt 27}}
\put(105,100){\makebox(35,25){\tt 28}}
\put(105,75){\makebox(35,25){\tt 29}}
\put(105,50){\makebox(35,25){\tt 30}}
\put(105,25){\makebox(35,25){\tt 31}}
\put(105,0){\makebox(35,25){\tt 32}}
\put(140,175){\makebox(35,25){\tt 33}}
\put(140,150){\makebox(35,25){\tt 34}}
\put(140,125){\makebox(35,25){\tt 35}}
\put(140,100){\makebox(35,25){\tt 36}}
\put(140,75){\makebox(35,25){\tt 37}}
\put(140,50){\makebox(35,25){\tt 38}}
\put(140,25){\makebox(35,25){\tt 39}}
\put(140,0){\makebox(35,25){\tt 40}}
\put(175,175){\makebox(35,25){\tt 41}}
\put(175,150){\makebox(35,25){\tt 42}}
\put(175,125){\makebox(35,25){\tt 43}}
\put(175,100){\makebox(35,25){\tt 44}}
\put(175,75){\makebox(35,25){\tt 45}}
\put(175,50){\makebox(35,25){\tt 46}}
\put(175,25){\makebox(35,25){\tt 47}}
\put(175,0){\makebox(35,25){\tt 48}}
\put(210,175){\makebox(35,25){\tt 49}}
\put(210,150){\makebox(35,25){\tt 50}}
\put(210,125){\makebox(35,25){\tt 51}}
\put(210,100){\makebox(35,25){\tt 52}}
\put(210,75){\makebox(35,25){\tt 53}}
\put(210,50){\makebox(35,25){\tt 54}}
\put(210,25){\makebox(35,25){\tt 55}}
\put(210,0){\makebox(35,25){\tt 56}}
\put(245,175){\makebox(35,25){\tt 57}}
\put(245,150){\makebox(35,25){\tt 58}}
\put(245,125){\makebox(35,25){\tt 59}}
\put(245,100){\makebox(35,25){\tt 60}}
\put(245,75){\makebox(35,25){\tt 61}}
\put(245,50){\makebox(35,25){\tt 62}}
\put(245,25){\makebox(35,25){\tt 63}}
\put(245,0){\makebox(35,25){\tt 64}}
\put(280,175){\makebox(35,25){\tt 65}}
\put(280,150){\makebox(35,25){\tt 66}}
\put(280,125){\makebox(35,25){\tt 67}}
\put(280,100){\makebox(35,25){\tt 68}}
\put(280,75){\makebox(35,25){\tt 69}}
\put(280,50){\makebox(35,25){\tt 70}}
\put(280,25){\makebox(35,25){\tt 71}}
\put(280,0){\makebox(35,25){\tt 72}}
\put(315,175){\makebox(35,25){\tt 73}}
\put(315,150){\makebox(35,25){\tt 74}}
\put(315,125){\makebox(35,25){\tt 75}}
\put(315,100){\makebox(35,25){\tt 76}}
\put(315,75){\makebox(35,25){\tt 77}}
\put(315,50){\makebox(35,25){\tt 78}}
\put(315,25){\makebox(35,25){\tt 79}}
\put(315,0){\makebox(35,25){\tt 80}}
\put(350,175){\makebox(35,25){\tt 81}}
\put(350,150){\makebox(35,25){\tt 82}}
\put(350,125){\makebox(35,25){\tt 83}}
\put(350,100){\makebox(35,25){\tt 84}}
\put(350,75){\makebox(35,25){\tt 85}}
\put(350,50){\makebox(35,25){\tt 86}}
\put(350,25){\makebox(35,25){\tt 87}}
\put(350,0){\makebox(35,25){\tt 88}}
\put(385,175){\makebox(35,25){\tt 89}}
\put(385,150){\makebox(35,25){\tt 90}}
\put(385,125){\makebox(35,25){\tt 91}}
\put(385,100){\makebox(35,25){\tt 92}}
\put(385,75){\makebox(35,25){\tt 93}}
\put(385,50){\makebox(35,25){\tt 94}}
\put(385,25){\makebox(35,25){\tt 95}}
\put(385,0){\makebox(35,25){\tt 96}}
\put(420,175){\makebox(35,25){\tt 97}}
\put(420,150){\makebox(35,25){\tt 98}}
\put(420,125){\makebox(35,25){\tt 99}}
\put(420,100){\makebox(35,25){\tt 100}}
\put(420,75){\makebox(35,25){\tt 101}}
\put(420,50){\makebox(35,25){\tt 102}}
\put(420,25){\makebox(35,25){\tt 103}}
\put(420,0){\makebox(35,25){\tt 104}}
\put(455,175){\makebox(35,25){\tt 105}}
\put(455,150){\makebox(35,25){\tt 106}}
\put(455,125){\makebox(35,25){\tt 107}}
\put(455,100){\makebox(35,25){\tt 108}}
\put(455,75){\makebox(35,25){\tt 109}}
\put(455,50){\makebox(35,25){\tt 110}}
\put(455,25){\makebox(35,25){\tt 111}}
\put(455,0){\makebox(35,25){\tt 112}}
\put(490,175){\makebox(35,25){\tt 113}}
\put(490,150){\makebox(35,25){\tt 114}}
\put(490,125){\makebox(35,25){\tt 115}}
\put(490,100){\makebox(35,25){\tt 116}}
\put(490,75){\makebox(35,25){\tt 117}}
\put(490,50){\makebox(35,25){\tt 118}}
\put(490,25){\makebox(35,25){\tt 119}}
\put(490,0){\makebox(35,25){\tt 120}}
\put(525,175){\makebox(35,25){\tt 121}}
\put(525,150){\makebox(35,25){\tt 122}}
\put(525,125){\makebox(35,25){\tt 123}}
\put(525,100){\makebox(35,25){\tt 124}}
\put(525,75){\makebox(35,25){\tt 125}}
\put(525,50){\makebox(35,25){\tt 126}}
\put(525,25){\makebox(35,25){\tt 127}}
\put(525,0){\makebox(35,25){\tt 128}}
\end{picture}
\end{center}

\noindent
\texttt{FOXTROT(I)} is aligned with element 3*I-2 of the template.

Suppose on the other hand that the interface to \texttt{TERPSICHORE} looks
like this:

                                                                        \CODE
      SUBROUTINE TERPSICHORE(FOXTROT)
      LOGICAL FOXTROT(:)
!HPF$ DISTRIBUTE FOXTROT(BLOCK)
                                                                        \EDOC


In this case, the template of \texttt{FOXTROT} is its natural
template; it has the same size 14 as \texttt{FOXTROT} itself.  The
actual argument, \texttt{FRUG(1:40:3)}, is mapped to the 16 processors
in this manner:

\begin{center}
\begin{tabular}{cc}
Abstract  &  Elements \\
processor & of \texttt{FOXTROT} \\
1 & 1, 2, 3 \\
2 & 4, 5, 6 \\
3 & 7, 8 \\
4 & 9, 10, 11 \\
5 & 12, 13, 14 \\
6--16   &  none
\end{tabular}
\end{center}

That is, the original positions (in the template of the actual
argument) of the elements of the dummy are as follows:

\begin{center}
\setlength{\unitlength}{0.01in}
\begin{picture}(560,225)(0,0)
\put(0,200){\makebox(35,25){\small\rm 1}}
\put(35,200){\makebox(35,25){\small\rm 2}}
\put(70,200){\makebox(35,25){\small\rm 3}}
\put(105,200){\makebox(35,25){\small\rm 4}}
\put(140,200){\makebox(35,25){\small\rm 5}}
\put(175,200){\makebox(35,25){\small\rm 6}}
\put(210,200){\makebox(35,25){\small\rm 7}}
\put(245,200){\makebox(35,25){\small\rm 8}}
\put(280,200){\makebox(35,25){\small\rm 9}}
\put(315,200){\makebox(35,25){\small\rm 10}}
\put(350,200){\makebox(35,25){\small\rm 11}}
\put(385,200){\makebox(35,25){\small\rm 12}}
\put(420,200){\makebox(35,25){\small\rm 13}}
\put(455,200){\makebox(35,25){\small\rm 14}}
\put(490,200){\makebox(35,25){\small\rm 15}}
\put(525,200){\makebox(35,25){\small\rm 16}}
\thinlines
\multiput(0,25)(0,25){7}{\line(1,0){560}}
\thicklines
\multiput(0,0)(0,200){2}{\line(1,0){560}}
\multiput(0,0)(35,0){17}{\line(0,1){200}}
\put(0,175){\makebox(35,25){\tt 1}}
\put(0,100){\makebox(35,25){\tt 2}}
\put(0,25){\makebox(35,25){\tt 3}}
\put(35,150){\makebox(35,25){\tt 4}}
\put(35,75){\makebox(35,25){\tt 5}}
\put(35,0){\makebox(35,25){\tt 6}}
\put(70,125){\makebox(35,25){\tt 7}}
\put(70,50){\makebox(35,25){\tt 8}}
\put(105,175){\makebox(35,25){\tt 9}}
\put(105,100){\makebox(35,25){\tt 10}}
\put(105,25){\makebox(35,25){\tt 11}}
\put(140,150){\makebox(35,25){\tt 12}}
\put(140,75){\makebox(35,25){\tt 13}}
\put(140,0){\makebox(35,25){\tt 14}}
\end{picture}
\end{center}

This layout (3 elements on the first processor, 3 on the second, 2 on
the third, 3 on the fourth, \dots) cannot properly be described as a
\texttt{BLOCK} distribution.  Therefore, remapping will take place at
the call.
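
As a check on the arithmetic, the following sketch tabulates both
layouts.  It assumes, as the figures above suggest, that the template
of \texttt{FRUG} is distributed in blocks of \(128/16 = 8\) elements.
It further assumes that the \texttt{BLOCK} distribution of
\texttt{FOXTROT} would use all 16 abstract processors (the directive
has no \texttt{ONTO} clause, so this is only an assumption), giving a
block size of \(\lceil 14/16 \rceil = 1\).  The program and variable
names are purely illustrative.

                                                                        \CODE
      PROGRAM OWNERS
      ! Illustrative sketch, not part of the language definition.
      ! For each FOXTROT(I), print the abstract processor that holds
      ! it on entry (P_ACT) and the one that a BLOCK distribution
      ! over 16 processors would assign (P_DUM).
      INTEGER, PARAMETER :: NP = 16, NTEMPL = 128, NDUM = 14
      INTEGER :: I, BS_ACT, BS_DUM, P_ACT, P_DUM
      BS_ACT = (NTEMPL + NP - 1) / NP   ! ceiling(128/16) = 8
      BS_DUM = (NDUM + NP - 1) / NP     ! ceiling(14/16) = 1
      DO I = 1, NDUM
        ! FOXTROT(I) sits at template element 3*I-2 of the actual
        P_ACT = (3*I - 2 + BS_ACT - 1) / BS_ACT
        P_DUM = (I + BS_DUM - 1) / BS_DUM
        PRINT *, I, P_ACT, P_DUM
      END DO
      END PROGRAM OWNERS
                                                                        \EDOC

Under these assumptions the second column reproduces the table above,
while the third column simply equals \texttt{I}.  The two layouts
differ, which is why remapping is needed.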

Remapping can be avoided without using \texttt{INHERIT} by explicitly
aligning the dummy to a declared template of size 128 distributed
\texttt{BLOCK}:

                                                                        \CODE
      SUBROUTINE TERPSICHORE(FOXTROT)
      LOGICAL FOXTROT(:)
!HPF$ PROCESSORS DANCE_FLOOR(16)
!HPF$ TEMPLATE, DISTRIBUTE(BLOCK) ONTO DANCE_FLOOR::GURF(128)
!HPF$ ALIGN FOXTROT(J) WITH GURF(3*J-2)
                                                                        \EDOC

\begin{users}
  The advantage of this technique is that, where it can be used, it
  gives the compiler more information; this information can often be
  used to generate more efficient code.
\end{users}

\section{Explicit Interfaces}
\label{mapsub:ExplicitInterfaces}

An explicit interface is required \emph{except} when all four of the
following conditions hold:

\begin{enumerate}

\item  Fortran does not require one, \emph{and}
  
\item No dummy argument is passed transcriptively or with the
  \texttt{INHERIT} attribute, \emph{and}

\item For each pair of corresponding actual and dummy arguments, either:

  \begin{enumerate}

  \item They are both implicitly mapped, or

  \item They are both explicitly mapped and the conditions in
    Section~\ref{mapsub:NoExplicitInterface} are satisfied.

  \end{enumerate}

  \emph{and}

\item For each pair of corresponding actual and dummy arguments, either:

  \begin{enumerate}

  \item Both are sequential, or

  \item Both are nonsequential.

  \end{enumerate}

\end{enumerate}

\begin{rationale}
  This has the following consequences:

  \begin{itemize}

  \item A plain Fortran program (i.e., with no HPF directives) will
    continue to be legal without the need to add additional interfaces.
    This is ensured by items 1, 2, 3a, and 4a.
  
  \item If remapping is necessary, this fact will be visible to the
    caller.  Thus the implementation may choose to have all remapping
    performed by the caller.

  \end{itemize}
\end{rationale}

\begin{users}
  This requirement pushes the user strongly in the direction of always
  providing explicit interfaces.  This is a good thing---explicit
  interfaces allow many errors to be caught at compile-time and greatly
  speed up the process of robust software development.

  Note that an explicit interface can be provided in three ways:

  \begin{enumerate}

  \item Module subprograms have an explicit interface.

  \item Contained subprograms have an explicit interface.

  \item An explicit interface may be provided by an interface block.

  \end{enumerate}

  In addition, an intrinsic procedure always has an explicit interface
  by definition.

  The idiomatic Fortran way of programming makes extensive use of
  modules; every subprogram, for instance, can be in a module.  This
  provides explicit interfaces automatically, with no extra effort on
  the part of the programmer.  It should very seldom be necessary to
  write an interface block.
\end{users}
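
For instance, the inherited-mapping version of \texttt{TERPSICHORE}
could be packaged as a module subprogram, as sketched below.  The
module name \texttt{DANCES}, the program name \texttt{BALLROOM}, the
declaration of \texttt{FRUG}, and the executable statements are
assumptions made for illustration; only the subroutine header and its
directive are taken from the earlier example.

                                                                        \CODE
      MODULE DANCES
      CONTAINS
        SUBROUTINE TERPSICHORE(FOXTROT)
          LOGICAL FOXTROT(:)
!HPF$     INHERIT FOXTROT
          FOXTROT = .TRUE.
        END SUBROUTINE TERPSICHORE
      END MODULE DANCES

      PROGRAM BALLROOM
      ! The USE statement makes the interface of TERPSICHORE,
      ! including its mapping directive, explicit in the caller.
      USE DANCES
      LOGICAL FRUG(128)
!HPF$ PROCESSORS DANCE_FLOOR(16)
!HPF$ DISTRIBUTE FRUG(BLOCK) ONTO DANCE_FLOOR
      FRUG = .FALSE.
      CALL TERPSICHORE(FRUG(1:40:3))
      END PROGRAM BALLROOM
                                                                        \EDOC

No interface block is written anywhere in this sketch; the module
supplies the explicit interface that the \texttt{INHERIT} attribute
requires.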

\subsection{Characteristics of Procedures}

The characteristics of dummy data objects and function results as
given in Section 12.2.1.1 of the Fortran standard are extended to
include also the \emph{hpf-characteristics} of such objects, which
are defined recursively as follows:

\begin{itemize}
\item A processor arrangement has one hpf-characteristic: its shape.

\item A template has up to three hpf-characteristics:

\begin{enumerate}
\item its shape;
\item its distribution, if explicitly stated;
\item the hpf-characteristic (i.e., the shape) of the processor
  arrangement onto which it is distributed, if explicitly stated.
\end{enumerate}

\item A dummy data object has the following hpf-characteristics:

\begin{enumerate}
\item its alignment, if explicitly stated, as well as all
  hpf-characteristics of its align target;
\item its distribution, if explicitly stated, as well as the
  hpf-characteristic (i.e., the shape) of the processor arrangement
  onto which it is distributed, if explicitly stated.
\end{enumerate}

\item A function result has the same hpf-characteristics as a dummy
  data object.  Specifically, it has the following
  hpf-characteristics:

\begin{enumerate}
\item its alignment, if explicitly stated, as well as all
  hpf-characteristics of its align target;
\item its distribution, if explicitly stated, as well as the
  hpf-characteristic (i.e., the shape) of the processor arrangement
  onto which it is distributed, if explicitly stated.
\end{enumerate}

\end{itemize}

\begin{rationale}
  When an explicit interface is given by an interface block, the
  Fortran standard prescribes the information that the interface
  block must contain; it does this using the concept of a Fortran
  \emph{characteristic}.  Characteristics of dummy data objects, for
  instance, include their types.  Characteristics must be specified in
  interface blocks; Section 12.3.2.1 of the Fortran standard states

  \begin{quote}
    An interface body specifies all of the procedure's characteristics and
    these shall be consistent with those specified in the procedure
    definition\dots
  \end{quote}

  Normally, an interface block for a procedure is a textual copy of the
  appropriate declarations of that procedure.  This section simply says
  that such a textual copy must include any explicit mapping directives
  relevant to dummy arguments of the procedure.
\end{rationale}
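
A minimal sketch of such a textual copy follows, using the
\texttt{BLOCK}-distributed version of \texttt{TERPSICHORE}.  The
calling program \texttt{BALLROOM}, the declaration and mapping of
\texttt{FRUG}, and the executable statements are assumptions added so
that the fragment is complete.

                                                                        \CODE
      PROGRAM BALLROOM
      INTERFACE
        SUBROUTINE TERPSICHORE(FOXTROT)
          LOGICAL FOXTROT(:)
          ! The mapping directive is an hpf-characteristic of the
          ! dummy, so it is repeated here.
!HPF$     DISTRIBUTE FOXTROT(BLOCK)
        END SUBROUTINE TERPSICHORE
      END INTERFACE
      LOGICAL FRUG(128)
!HPF$ DISTRIBUTE FRUG(BLOCK)
      FRUG = .FALSE.
      CALL TERPSICHORE(FRUG(1:40:3))
      END PROGRAM BALLROOM

      SUBROUTINE TERPSICHORE(FOXTROT)
      LOGICAL FOXTROT(:)
!HPF$ DISTRIBUTE FOXTROT(BLOCK)
      FOXTROT = .TRUE.
      END SUBROUTINE TERPSICHORE
                                                                        \EDOC

Omitting the \texttt{DISTRIBUTE} directive from the interface body
would make the interface inconsistent with the procedure definition.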

\subsection{Explicit Mappings Without an Explicit Interface}
\label{mapsub:NoExplicitInterface}

When there is no explicit interface, the actual and dummy arguments
may still be explicitly mapped (but not transcriptively or with the
\texttt{INHERIT} attribute), as long as the conditions in this section
are satisfied.

\begin{users}
  These conditions are complex.  The important thing to realize is
  that you don't have to read any of this if you have an explicit
  interface.  So if there is any doubt in your mind, just make sure
  you have an explicit interface.
\end{users}

\begin{enumerate}
  
\item The actual argument must be a named object.  (So for instance,
  the actual argument cannot be an array section.)
  
\item The shapes of the ultimate align targets of the dummy and the
  actual must be the same.

\item Either

  \begin{enumerate}
  \item the ultimate align targets of the dummy and actual must not
    have been explicitly distributed, or
    
  \item the \texttt{DISTRIBUTE} directives specified for both ultimate
    align targets must

    \begin{enumerate}
    \item have a \textit{dist-onto-clause} specifying processor
      arrangements with the same shape; and
      
    \item specify a \textit{dist-format-clause} with the same
      \textit{dist-format}s in corresponding positions, except that a
      \textit{dist-format} of \texttt{BLOCK} is regarded as the same
      as a \textit{dist-format} of \texttt{BLOCK(\(n\))} if the
      actual blocking size of the \texttt{BLOCK} distribution has the
      same value as \(n\); and the same for \texttt{CYCLIC} and
      \texttt{CYCLIC(\(n\))}.
    \end{enumerate}
  \end{enumerate}

\item Each dimension of the actual and dummy must correspond to the same
  dimension of their respective ultimate align targets, and
  corresponding elements of the actual and dummy must be aligned with
  the same corresponding elements of their respective ultimate align
  targets.
\end{enumerate}
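
The following sketch is intended to satisfy all of these conditions,
so that no explicit interface should be needed.  The names
\texttt{SWING}, \texttt{A}, \texttt{X}, \texttt{P}, and \texttt{Q} are
hypothetical.  The actual is a named array; both arrays are their own
ultimate align targets and have shape \texttt{(100)}; both processor
arrangements have shape \texttt{(4)}; and \texttt{BLOCK} over four
processors has an actual blocking size of \(100/4 = 25\), which matches
\texttt{BLOCK(25)}.

                                                                        \CODE
      PROGRAM HALL
      REAL A(100)
!HPF$ PROCESSORS P(4)
!HPF$ DISTRIBUTE A(BLOCK) ONTO P
      A = 1.0
      ! No interface block: SWING has an implicit interface
      CALL SWING(A)
      END PROGRAM HALL

      SUBROUTINE SWING(X)
      REAL X(100)
!HPF$ PROCESSORS Q(4)
!HPF$ DISTRIBUTE X(BLOCK(25)) ONTO Q
      X = 0.0
      END SUBROUTINE SWING
                                                                        \EDOC

Changing the dummy's directive to \texttt{BLOCK(50)}, or passing the
section \texttt{A(1:100)} instead of \texttt{A}, would violate the
conditions and so would call for an explicit interface.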


\section{Restrictions on Pointers and Targets}
\label{mapsub:pointers}

If, on invocation of a procedure P: (a)~a dummy argument has the
\texttt{TARGET} attribute, and (b)~the corresponding actual argument
has the \texttt{TARGET} attribute and is not an array section with a
vector subscript (and therefore is an object A or a section of an
array A), then the program is not HPF-conforming unless:
\begin{enumerate}
\item No remapping of the actual argument occurs during the call; or
\item the remainder of program execution would be unaffected if
  \begin{enumerate}
  \item\label{first-item} each pointer associated with any portion of
    the dummy argument or with any portion of A during execution of P
    were to acquire undefined pointer association status on exit from
    P; and
  \item\label{second-item} each pointer associated with any portion of
    A before the call were to acquire undefined pointer association
    status on entry to P and, if not reassigned during execution of P,
    were to be restored on exit to the pointer association status it
    had before entry.
  \end{enumerate}
\end{enumerate}

Note that if a dummy argument has the \texttt{TARGET} attribute and no
explicit mapping attributes, then the \texttt{INHERIT} attribute is
implicitly assumed (see Section~\ref{INHERIT-SECTION}); therefore no
remapping occurs for such a dummy argument and there is no problem.

\begin{rationale}
  These restrictions are made in order to support the following part
  of the Fortran standard (in Section 12.4.1.1 of that document) in
  the face of implicit remapping across the subprogram interface:
  \begin{quote}
    If the dummy argument does not have the \texttt{TARGET} or
    \texttt{POINTER} attribute, any pointers associated with the actual
    argument do not become associated with the corresponding dummy
    argument on invocation of the procedure.
  
    If the dummy argument has the \texttt{TARGET} attribute and the
    corresponding actual argument has the \texttt{TARGET} attribute but
    is not an array section with a vector subscript:
    \begin{enumerate}
    \item Any pointers associated with the actual argument become
      associated with the corresponding dummy argument on invocation of
      the procedure.
  
    \item When execution of the procedure completes, any pointers
      associated with the dummy argument remain associated with the actual
      argument.
    \end{enumerate}

    If the dummy argument has the \texttt{TARGET} attribute and the
    corresponding actual argument does not have the \texttt{TARGET}
    attribute or is an array section with a vector subscript, any pointers
    associated with the dummy argument become undefined when execution of
    the procedure completes.
  \end{quote}
\end{rationale}

\subsection{Example}

Here is an example that illustrates the restrictions of this section:

\CODE

      INTEGER, TARGET, DIMENSION (10) :: ACT
      INTEGER, POINTER, DIMENSION (:) :: POINTS_TO_ACT, POINTS_TO_DUM
!HPF$ DISTRIBUTE ACT(BLOCK)

      POINTS_TO_ACT => ACT
      CALL F(ACT)
      POINTS_TO_DUM(1) = 1             ! ILLEGAL

      CONTAINS
        SUBROUTINE F(DUM)
          INTEGER, TARGET, DIMENSION(10) :: DUM
        !HPF$ DISTRIBUTE DUM(CYCLIC)

          POINTS_TO_DUM => DUM
          POINTS_TO_ACT(1) = 1         ! ILLEGAL
        END SUBROUTINE
      END

\EDOC

The assignment to \texttt{POINTS\_TO\_DUM(1)} is illegal because it
violates item~\ref{first-item}; the assignment to
\texttt{POINTS\_TO\_ACT(1)} is illegal because it violates
item~\ref{second-item}.

\section{Argument Passing and Sequence Association}

For actual arguments in a procedure call, Fortran allows an array
element (scalar) to be associated with a dummy argument that is an
array.  It furthermore allows the shape of a dummy argument to differ
from the shape of the corresponding actual array argument, in effect
reshaping the actual argument via the procedure call.  Storage
sequence properties of Fortran are used to identify the values of the
dummy argument.  This feature, carried over from FORTRAN 77, has been
widely used to pass starting addresses of subarrays, rows or columns
of a larger array, to procedures.  For HPF arrays that are potentially
mapped across processors, this feature is not fully supported.


\subsection{Sequence Association Rules}

\begin{enumerate}
  
\item When an array element or the name of an assumed-size array is
  used as an actual argument, the associated dummy argument must be a
  scalar or specified to be a sequential array.
  
  An array-element designator of a nonsequential array must not be
  associated with a dummy array argument.
  
\item When an actual argument is an array or array section and the
  corresponding dummy argument differs from the actual argument in
  shape, then the dummy argument must be declared sequential and the
  actual array argument must be sequential.
  
\item An object of type character (scalar or array) is nonsequential
  if it conforms to the requirements of Definition~\ref{seq-var} of
  Section~\ref{sequence-defs}.  If the length of an explicit-length
  character dummy argument differs from the length of the actual
  argument, then both the actual and dummy arguments must be
  sequential.

  
\item Without an explicit interface, a sequential actual may not be
  associated with a nonsequential dummy and a nonsequential actual may
  not be associated with a sequential dummy.  (This item merely
  repeats part of Section~\ref{mapsub:ExplicitInterfaces}.)


\end{enumerate}


\subsection{Discussion of Sequence Association}

When the shape of the dummy array argument and its associated actual
array argument differ, the actual argument must not be an expression.
There is no HPF mechanism for declaring that the value of an
array-valued expression is sequential.  In order to associate such an
expression as an actual argument with a dummy argument of different
rank, the actual argument must first be assigned to a named array
variable that is forced to be sequential according to
Definition~\ref{seq-var} of Section~\ref{sequence-defs}.
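
A sketch of this workaround follows.  It assumes that the
\texttt{SEQUENCE} directive is the means of forcing the named
temporary to be sequential; the names \texttt{NEST}, \texttt{TALLY},
\texttt{A}, \texttt{B}, and \texttt{T} are hypothetical.

                                                                        \CODE
      PROGRAM NEST
      REAL A(20,10), B(20,10), T(20,10)
!HPF$ SEQUENCE :: T
      A = 1.0
      B = 2.0
      ! The expression A + B cannot be declared sequential, so it
      ! is first assigned to the sequential temporary T.
      T = A + B
      CALL TALLY(T)
      END PROGRAM NEST

      SUBROUTINE TALLY(X)
      ! The dummy has a different shape from the actual, so it
      ! must be sequential as well.
      REAL X(200)
!HPF$ SEQUENCE :: X
      PRINT *, SUM(X)
      END SUBROUTINE TALLY
                                                                        \EDOC

Passing \texttt{A + B} directly to \texttt{TALLY} would not be
HPF-conforming, because the shapes of the actual and the dummy differ.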

\subsection{Examples of Sequence Association}

Given the following subroutine fragment:
                                                                \CODE
      SUBROUTINE HOME (X)
      DIMENSION X (20,10)
                                                                \EDOC
By rule 1
                                                                \CODE
      CALL HOME (ET (2,1))
                                                                \EDOC
is legal only if \texttt{X} is declared sequential in \texttt{HOME}
and \texttt{ET} is sequential in the calling procedure.

Likewise, by rules 2 and 4
                                                                \CODE
      CALL HOME (ET)
                                                                \EDOC
requires either that \texttt{ET} and \texttt{X} are both sequential arrays or 
that \texttt{ET} and \texttt{X} have the same shape and have the same
sequence attribute.


Rule 3 addresses a special consideration for objects of type
character.  A change in the length of character objects across
a call, as in

                                                                \CODE
      CHARACTER (LEN=44) one_long_word
      one_long_word = 'Chargoggagoggmanchaugagoggchaubunagungamaugg'
      CALL webster(one_long_word)

      SUBROUTINE webster(short_dictionary)
      CHARACTER (LEN=4) short_dictionary (11)
          !Note that short_dictionary(3) is 'agog', for example
                                                                \EDOC

\noindent
is conceptually legal in Fortran. In HPF, both the actual argument and
dummy argument must be sequential.
(Chargoggagoggmanchaugagoggchaubunagungamaugg is the original Nipmuc
name for what is now called Lake Webster in Massachusetts.)

---------------------------------------------------------------------------
To (un)subscribe to this list, send mail to hpff-doc-request@cs.rice.edu.
Leave the subject line blank, and in the body put the line
(un)subscribe <email-address>
---------------------------------------------------------------------------

From owner-hpff-doc  Sun Sep  8 18:57:38 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id SAA09660 for hpff-doc-out; Sun, 8 Sep 1996 18:57:38 -0500 (CDT)
Received: from mail.cis.ohio-state.edu (mail.cis.ohio-state.edu [164.107.8.55]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id SAA09654 for <hpff-doc@cs.rice.edu>; Sun, 8 Sep 1996 18:57:33 -0500 (CDT)
Received: from chicago.cis.ohio-state.edu (chicago.cis.ohio-state.edu [164.107.136.7]) by mail.cis.ohio-state.edu (8.6.7/8.6.4) with ESMTP id TAA01571 for <hpff-doc@cs.rice.edu>; Sun, 8 Sep 1996 19:57:31 -0400
From: P Sadayappan <saday@cis.ohio-state.edu>
Received: (saday@localhost) by chicago.cis.ohio-state.edu (8.6.7/8.6.4) id TAA16693 for hpff-doc@cs.rice.edu; Sun, 8 Sep 1996 19:57:31 -0400
Message-Id: <199609082357.TAA16693@chicago.cis.ohio-state.edu>
Subject: hpff-doc: overview.tex
To: hpff-doc@cs.rice.edu
Date: Sun, 8 Sep 1996 19:57:30 -0500 (EDT)
X-Mailer: ELM [version 2.4 PL22]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-doc
Precedence: bulk

% File: overview.tex

% Contents:
% Overview for HPF 2.0 document, including
%	Document/language structure
%	Brief descriptions of new features & extensions
%	Fortran language base
%	Notation and Syntax

% Revision history:
% May-10-96	Created by Charles Koelbel, Rice University
%		(copied from HPF 1.1 document)
% June 26-96	
% August 20-96  Edited by P. Sadayappan, Ohio State University
% September 8-96  Edited by P. Sadayappan, Ohio State University



\chapter{Overview}
\label{ch-overview}



{\em
Comments on this chapter should be directed to 
P. Sadayappan ({\tt saday@cis.ohio-state.edu}),
and {\tt hpff-doc@cs.rice.edu}.
Please use ``{\tt Comments on Overview}'' as the {\tt Subject:}
line.
\par
}


%Please be sure to add this material to some file that glombnf
%will read--- overview.tex  would be fine:
%Check these rule numbers versus the Fortran 95 specification!!!

\Fortranrule{R710}{add-op}
\Fortranrule{R706}{add-operand}
\Fortranrule{R625}{allocate-object}
\Fortranrule{R622}{allocate-stmt}
\Fortranrule{R431}{array-constructor}
\Fortranrule{R512}{array-spec}
\Fortranrule{R735}{assignment-stmt}
\Fortranrule{R838}{assign-stmt}
\Fortranrule{R1210}{call-stmt}
\Fortranrule{R529}{data-stmt}
\Fortranrule{R631}{deallocate-stmt}
\Fortranrule{R1221}{dummy-arg}
\Fortranrule{R1218}{end-function-stmt}
\Fortranrule{R1222}{end-subroutine-stmt}
\Fortranrule{R504}{entity-decl}
\Fortranrule{R215}{executable-construct}
\Fortranrule{R208}{execution-part}
\Fortranrule{R513}{explicit-shape-spec}
\Fortranrule{R723}{expr}
\Fortranrule{R1209}{function-reference}
\Fortranrule{R1215}{function-subprogram}
\Fortranrule{R914}{input-item}
\Fortranrule{R728}{int-expr}
\Fortranrule{R607}{int-variable}
\Fortranrule{R1204}{interface-body}
\Fortranrule{R210}{internal-subprogram-part}
\Fortranrule{R707}{level-2-expr}
\Fortranrule{R741}{mask-expr}
\Fortranrule{R705}{mult-operand}
\Fortranrule{R544}{namelist-group-object}
\Fortranrule{R543}{namelist-stmt}
\Fortranrule{R629}{nullify-stmt}
\Fortranrule{R915}{output-item}
\Fortranrule{R844}{pause-stmt}
\Fortranrule{R736}{pointer-assignment-stmt}
\Fortranrule{R630}{pointer-object}
\Fortranrule{R909}{read-stmt}
\Fortranrule{R734}{specification-expr}
\Fortranrule{R204}{specification-part}
\Fortranrule{R618}{section-subscript}
\Fortranrule{R623}{stat-variable}
\Fortranrule{R842}{stop-stmt}
\Fortranrule{R620}{stride}
\Fortranrule{R617}{subscript}
\Fortranrule{R619}{subscript-triplet}
\Fortranrule{R737}{target}
\Fortranrule{R501}{type-declaration-stmt}
\Fortranrule{R502}{type-spec}
\Fortranrule{R601}{variable}
\Fortranrule{R739}{where-construct}
\Fortranrule{R738}{where-stmt}
\Fortranrule{R910}{write-stmt}


This document specifies the form and establishes the interpretation of
programs expressed in the High Performance Fortran (HPF) language.  It
is designed as a set of extensions and modifications to the established
International Standard for Fortran. At the time of publication of this document,
the version of the standard used as a base is informally referred to as 
``Fortran~95'' (a revision of ISO/IEC 1539:1991(E) and ANSI X3.198-1992~\cite{F9091},
to be published in October 1996).  In this overview 
chapter of the document, we outline the goals and scope of the language,
introduce the HPF language model, highlight the main features of the language,
and provide a guide to the rest of this document.

\section{Goals and Scope of High Performance Fortran}
\label{overview-goals}

The primary goals behind the development of the HPF language included:
\begin{itemize}

\item Portability across different architectures;

\item Data parallel programming (defined as single threaded, global name
space, and loosely synchronous parallel computation);

\item High performance on parallel computers with non-uniform memory
access costs (while not impeding performance on other machines); 

\item Use of Standard Fortran (currently Fortran 95) as a base; and

\item Open interfaces and interoperability with other languages (e.g. C)
and other programming paradigms (e.g. message passing using MPI).
\end{itemize}

Secondary goals included:
\begin{itemize}

\item Keeping the language understandable and implementable in a short
time span;

\item Provision of input to future standards activities for Fortran and C;

\item Provision of an evolutionary path for adding advanced features to
the base language in a consistent manner.
\end{itemize}

The first version of the language definition, HPF 1.0, was released in
May 1993. A number of language features that were defined in HPF 1.0
have now been absorbed into the current Fortran (95) language standard
(e.g. the {\tt FORALL} construct, {\tt PURE} and {\tt ELEMENTAL} procedures). 
These features
are therefore no longer detailed in the definition of HPF 2.0.
Information about the evolution of the HPF language (through versions 1.0,
1.1, and 2.0) and an enumeration of the differences between HPF 2.0 and
HPF 1.1 may be found in Annex~\ref{HPF-EVOLUTION}.

\section{HPF Language Model}

An important goal of HPF is to achieve code portability across a
variety of parallel machines.  This requires not only that HPF programs
compile on all target machines, but also that a highly-efficient HPF
program on one parallel machine be able to achieve reasonably high
efficiency on another parallel machine with a comparable number of
processors.  Otherwise, the effort spent by a programmer to achieve
high performance on one machine would be wasted when the HPF code is
ported to another machine.  Although 
shared-memory machines and distributed-memory machines may use 
different low-level primitives, there is broad
similarity with respect to the fundamental factors that affect the
performance of parallel programs on these machines. Thus, achieving high
efficiency across different parallel machines with the same high level
HPF program is a feasible goal. Some of the fundamental factors affecting
the performance of a parallel program are the degree of available parallelism,
exploitation of data locality, and choice of appropriate task granularity.
HPF provides mechanisms for the programmer to guide the compiler with respect
to these factors.

The first versions of HPF were defined as an extension of Fortran~90.
HPF 2.0 is defined as an extension to the current Fortran Standard (Fortran~95).
HPF will include and be consistent with advances in the Fortran standards, 
as they are approved by ISO.

Building on Fortran, HPF language features fall into four categories:

\begin{itemize}

\item HPF directives;

\item New language syntax;

\item New library routines; and

\item Language changes and restrictions.

\end{itemize}

HPF directives appear as structured comments that suggest implementation
strategies or assert facts about a program to the compiler.  They may
affect the efficiency of the computation performed, but generally do not change
the value computed by the program.  The form of the HPF directives has
been chosen so that a future Fortran standard may choose to include
these features as full statements in the language by deleting the
initial comment header.

A few new language features have been defined as direct extensions to 
Fortran syntax and interpretation.
The new HPF language features differ from HPF directives in that they 
are first-class 
language constructs rather than comments. They can directly affect the
result computed by a program. 

The HPF library of computational functions defines a standard interface
to routines that have proven valuable for high performance computing,
including additional reduction functions, combining scatter functions,
prefix and suffix functions, and sorting functions.

A small number of changes and restrictions to standard Fortran have also
been defined. The most significant restrictions are those imposed on the use of 
sequence and storage association, since they are not compatible with the use of
data distribution features of HPF. It is however possible to retain sequence
and storage association semantics in a program by use of certain explicit 
HPF directives.


The fundamental model of parallelism in HPF is that of single-threaded
data-parallel execution with a globally shared address space. Fortran 
array statements and the {\tt FORALL} statement are natural ways of
specifying data parallel computation. In addition, HPF provides the
{\tt INDEPENDENT} directive. It can be used to assert that certain loops
do not carry any dependences and therefore may be executed in parallel.

Exploitation of data locality is critical to achieving good performance on
high-performance computers, whether it be a uniprocessor workstation or a
parallel computer. On a Non-Uniform-Memory-Access (NUMA) parallel computer, 
the effective distribution of data among processor memories is very important
in reducing data movement overheads. One of the key features of HPF is the
facility for user specification of distribution of data. HPF provides a 
logical view of the parallel machine as a rectilinear arrangement of abstract
processors in one or more dimensions. The programmer can specify the relative
alignment of elements of different program arrays, and the distribution of 
arrays over the logical processor grid. Data mapping is specified using HPF
directives that can aid the compiler in optimizing parallel performance, 
but have no effect on the semantics of the program.

While HPF's single-threaded data-parallel model with a global name space 
is very convenient in many application contexts, other programming
languages (e.g. C) and other parallel programming paradigms (e.g. explicit
message-passing using MPI) may be more appropriate in certain contexts. In
recognition of this need, HPF formally defines the {\tt EXTRINSIC} mechanism 
to facilitate interoperability with other programming languages 
and/or paradigms.

The key concepts of HPF are illustrated below using simple examples.

                                                                \CODE
      REAL A(1000,1000)
!HPF$ PROCESSORS procs(4,4)
!HPF$ DISTRIBUTE (BLOCK,BLOCK) ONTO procs :: A
      DO k = 1, num_iter
         FORALL (i=2:999, j=2:999)
           A(i,j) = (A(i,j-1)+A(i-1,j)+A(i,j+1)+A(i+1,j))/4
         END FORALL
      END DO
                                                                \EDOC

The code fragment describes a simple Jacobi relaxation computation on
a two-dimensional floating-point array {\tt A}. The HPF directives
appear as structured comments. The {\tt PROCESSORS} directive specifies a
logical \( 4 \times 4 \) grid of processors {\tt procs}. The {\tt DISTRIBUTE} 
directive recommends that the compiler partition the array {\tt A} into
equal-sized blocks along each of its dimensions. This will result in a 
\( 4 \times 4 \)
configuration of blocks each containing \( 250 \times 250 \) elements, 
one block per processor. 
The {\tt PROCESSORS} and {\tt DISTRIBUTE} directives are described in detail
in Chapter~\ref{ch-mapping-base}.
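
The block partitioning described above can be sketched in a few lines of
Python. This is only an illustration, not part of HPF: the function names
{\tt block\_owner} and {\tt owner\_2d} are hypothetical, the array is assumed
to have lower bound 1 in each dimension, and the compiler is assumed to
follow the {\tt DISTRIBUTE} recommendation exactly.

```python
def block_owner(index, extent, nprocs):
    """Return the 1-based processor coordinate that owns a 1-based array index."""
    block_size = -(-extent // nprocs)  # ceiling division: 250 for 1000 elements on 4 processors
    return (index - 1) // block_size + 1


def owner_2d(i, j, extent=1000, grid=(4, 4)):
    """Return the coordinates in procs(4,4) of the processor owning A(i,j)."""
    return (block_owner(i, extent, grid[0]), block_owner(j, extent, grid[1]))


print(owner_2d(1, 1))        # (1, 1)
print(owner_2d(250, 251))    # (1, 2)
print(owner_2d(1000, 1000))  # (4, 4)
```

Each processor coordinate owns 250 consecutive indices in each dimension,
which gives the \( 250 \times 250 \) blocks mentioned in the text.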

The outer {\tt DO k} loop iterates over {\tt num\_iter}
Jacobi relaxation steps. The inner loop uses the Fortran~95 {\tt FORALL}
construct. It specifies the execution of the loop body for all values of
{\tt i} and {\tt j} in the range {\tt 2:999}. The semantics of the {\tt FORALL}
require that the right-hand-side expressions for all iterations
(i.e., for all values of {\tt i} and {\tt j} between {\tt 2}
and {\tt 999}) be evaluated before the assignments to the 
left-hand-side variables are performed.
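
The difference between {\tt FORALL} semantics and an ordinary sequential
loop nest can be seen in the following Python sketch. It is an illustration
under stated assumptions: the function names are hypothetical, the grid is
a small list of lists with 0-based indices rather than the
\( 1000 \times 1000 \) array {\tt A}, and only one relaxation step is shown.

```python
def forall_step(a):
    """One relaxation sweep with FORALL semantics: evaluate every right-hand side first."""
    n = len(a)
    new_values = {
        (i, j): (a[i][j - 1] + a[i - 1][j] + a[i][j + 1] + a[i + 1][j]) / 4
        for i in range(1, n - 1)
        for j in range(1, n - 1)
    }
    result = [row[:] for row in a]
    for (i, j), value in new_values.items():  # assignments happen only after all evaluations
        result[i][j] = value
    return result


def do_loop_step(a):
    """The same update written as a sequential loop nest that updates in place."""
    result = [row[:] for row in a]
    n = len(result)
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            result[i][j] = (result[i][j - 1] + result[i - 1][j]
                            + result[i][j + 1] + result[i + 1][j]) / 4
    return result


grid = [[0.0, 0.0, 0.0, 0.0],
        [4.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
print(forall_step(grid)[1][1:3])   # [1.0, 0.0]
print(do_loop_step(grid)[1][1:3])  # [1.0, 0.25]
```

The sequential version lets a freshly updated neighbor leak into later
iterations; the {\tt FORALL} version uses only values from before the sweep.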

When targeted for execution on a distributed-memory machine with {\tt 16}
processors, the HPF compiler generates SPMD code, with each processor 
locally containing a part of the global array {\tt A}. The outer {\tt k} loop
is executed sequentially while the inner {\tt FORALL} loop
is executed in parallel. Each processor will require some ``boundary'' elements
of {\tt A} that reside in partitions mapped to the local memories of other
processors. Message-passing primitives to achieve the necessary inter-processor
communication are inserted by the HPF compiler into the generated SPMD code.
The single-threaded data-parallel model with a global name-space makes it
convenient for the programmer to specify the strategy for parallelization 
and data partitioning at a higher level of abstraction. The tedious low-level
details of translating from an abstract global name space to the local
memories of individual processors and the management of explicit 
inter-processor communication through message-passing are left to the compiler.

The following example illustrates how data distribution specifications
affect the communication requirements of scalar assignment statements
executed in parallel.  The explanations given do not
necessarily reflect the actual compilation process.

Consider the following statements:
								\CODE
      REAL a(1000), b(1000), c(1000), x(500), y(0:501)
      INTEGER inx(1000)
!HPF$ PROCESSORS procs(10)
!HPF$ DISTRIBUTE (BLOCK) ONTO procs :: a, b, inx
!HPF$ DISTRIBUTE (CYCLIC) ONTO procs :: c
!HPF$ ALIGN x(i) WITH y(i+1)
      ...
      a(i) = b(i)                    ! Assignment 1
      x(i) = y(i+1)                  ! Assignment 2
      a(i) = c(i)                    ! Assignment 3
      a(i) = a(i-1) + a(i) + a(i+1)  ! Assignment 4
      c(i) = c(i-1) + c(i) + c(i+1)  ! Assignment 5
      x(i) = y(i)                    ! Assignment 6
      a(i) = a(inx(i)) + b(inx(i))   ! Assignment 7
								\EDOC

In this example, the {\tt PROCESSORS} directive specifies a linear
arrangement of 10 processors.  The {\tt DISTRIBUTE} directives
recommend to the compiler that the arrays {\tt a}, {\tt b}, and {\tt
inx} should be distributed among the 10 processors with blocks of 100
contiguous elements per processor. The array {\tt c} is to be
cyclically distributed among the processors with {\tt c(1)}, {\tt
c(11)}, \ldots , {\tt c(991)} mapped onto processor {\tt procs(1)};
{\tt c(2)}, {\tt c(12)}, \ldots , {\tt c(992)} mapped onto processor
{\tt procs(2)}; and so on.  The complete mapping of arrays {\tt x} and
{\tt y} onto the processors is not specified, but their relative
alignment is indicated by the {\tt ALIGN} directive.  The {\tt ALIGN}
directive causes {\tt x(i)} and {\tt y(i+1)} to be stored on the same
processor for all values of {\tt i}, regardless of the actual
distribution chosen by the compiler for {\tt x} and {\tt y} ({\tt y(0)}
and {\tt y(1)} are not aligned with any element of {\tt x}).  The {\tt
PROCESSORS}, {\tt DISTRIBUTE}, and {\tt ALIGN} directives are discussed
in detail in Chapter~\ref{ch-mapping-base}.
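
As a rough check of the mappings just described, the following Python
sketch lists which elements land on a given processor. The names
{\tt block\_owner} and {\tt cyclic\_owner} are hypothetical helpers, and the
sketch assumes that the compiler honors the {\tt DISTRIBUTE} recommendations
exactly.

```python
NPROCS = 10
EXTENT = 1000


def block_owner(i):
    """1-based processor number owning element i under DISTRIBUTE (BLOCK)."""
    return (i - 1) // (EXTENT // NPROCS) + 1


def cyclic_owner(i):
    """1-based processor number owning element i under DISTRIBUTE (CYCLIC)."""
    return (i - 1) % NPROCS + 1


a_on_procs1 = [i for i in range(1, EXTENT + 1) if block_owner(i) == 1]
c_on_procs2 = [i for i in range(1, EXTENT + 1) if cyclic_owner(i) == 2]
print(a_on_procs1[0], a_on_procs1[-1], len(a_on_procs1))   # 1 100 100
print(c_on_procs2[:2], c_on_procs2[-1], len(c_on_procs2))  # [2, 12] 992 100
```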

In Assignment 1 ({\tt a(i) = b(i)}), the identical distribution of {\tt
a} and {\tt b} ensures that for all {\tt i}, {\tt a(i)} and {\tt b(i)}
are mapped to the same processor.  Therefore, the statement requires no
communication.

In Assignment 2 ({\tt x(i) = y(i+1)}), there is no inherent
communication.  In this case, the relative alignment of the two arrays
matches the assignment statement for any actual distribution of the
arrays.

Although Assignment 3 ({\tt a(i) = c(i)}) looks very similar to the
first assignment, the communication requirements are very different due
to the different distributions of {\tt a} and {\tt c}.  Array elements
{\tt a(i)} and {\tt c(i)} are mapped to the same processor for only
10\% of the possible values of {\tt i}.  (This can be seen by
inspecting the definitions of {\tt BLOCK} and {\tt CYCLIC} in
Chapter~\ref{ch-mapping-base}.) The elements are located on the same
processor if and only if \( \lfloor (i-1) / 100 \rfloor = (i-1) \bmod
10 \).  For example, the assignment involves no inherent communication
(i.e.,\ both {\tt a(i)} and {\tt c(i)} are on the same processor) if \(
i = 1 \) or \( i = 102 \), but does require communication if \( i = 2
\).
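
The 10\% figure can be checked by enumerating the condition directly. The
Python sketch below uses 0-based processor numbers, exactly as in the
formula above; the helper names are hypothetical.

```python
def block_proc(i):
    """0-based processor number of a(i): floor((i-1)/100)."""
    return (i - 1) // 100


def cyclic_proc(i):
    """0-based processor number of c(i): (i-1) mod 10."""
    return (i - 1) % 10


same_processor = [i for i in range(1, 1001) if block_proc(i) == cyclic_proc(i)]
print(len(same_processor))                       # 100
print(1 in same_processor, 102 in same_processor, 2 in same_processor)
# True True False
```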

In Assignment 4 ({\tt a(i) = a(i-1) + a(i) + a(i+1)}), the references
to array {\tt a} are all on the same processor for about 98\% of the
possible values of {\tt i}.  The exceptions to this are \( i = 100k \)
for any \( k = 1, 2, \ldots, 9 \) (when {\tt a(i)} and {\tt a(i-1)}
are on {\tt procs(k)} and {\tt a(i+1)} is on {\tt procs(k+1)}) and  \(
i = 100k + 1 \) for any \( k = 1, 2, \ldots, 9 \) (when {\tt a(i)} and
{\tt a(i+1)} are on {\tt procs(k+1)} and {\tt a(i-1)} is on {\tt
procs(k)}).  Thus, except for ``boundary'' elements on each processor,
this statement requires no inherent communication.
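
The boundary cases can be enumerated with a short Python sketch. It assumes
that the possible values of {\tt i} are 2 through 999, so that
{\tt a(i-1)} and {\tt a(i+1)} stay within the bounds of {\tt a}; the helper
name {\tt block\_owner} is hypothetical.

```python
def block_owner(i):
    """1-based processor number owning a(i) under DISTRIBUTE (BLOCK) on procs(10)."""
    return (i - 1) // 100 + 1


valid = range(2, 1000)  # i = 2 .. 999
needs_communication = [
    i for i in valid
    if len({block_owner(i - 1), block_owner(i), block_owner(i + 1)}) > 1
]
print(len(needs_communication))                         # 18
print(needs_communication[:4])                          # [100, 101, 200, 201]
print(round(1 - len(needs_communication) / len(valid), 3))  # 0.982
```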

Assignment~5, {\tt c(i) = c(i-1) + c(i) + c(i+1)}, while superficially
similar to Assignment 4, has very different communication behavior.
Because the distribution of {\tt c} is {\tt CYCLIC} rather than {\tt
BLOCK}, the three references {\tt c(i)}, {\tt c(i-1)}, and {\tt c(i+1)}
are mapped to three distinct processors for any value of {\tt i}.
Therefore, this statement requires communication for at least two of
the right-hand side references, regardless of the implementation
strategy.
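
The same enumeration for the {\tt CYCLIC} case confirms that no value of
{\tt i} keeps all three references on one processor. As before, this is a
sketch with a hypothetical helper name, restricted to
\( i = 2, \ldots, 999 \) so that all references are in bounds.

```python
def cyclic_owner(i):
    """1-based processor number owning c(i) under DISTRIBUTE (CYCLIC) on procs(10)."""
    return (i - 1) % 10 + 1


distinct_counts = {
    len({cyclic_owner(i - 1), cyclic_owner(i), cyclic_owner(i + 1)})
    for i in range(2, 1000)
}
print(distinct_counts)  # {3}
```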

The final two assignments have very limited information regarding the
communication requirements.  In Assignment~6 ({\tt x(i) = y(i)}) the
only information available is that {\tt x(i)} and {\tt y(i+1)} are on
the same processor; this has no logical consequences for the
relationship between {\tt x(i)} and {\tt y(i)}.  Thus, nothing can be
said regarding communication in the statement without further
information.  In Assignment 7 ({\tt a(i) = a(inx(i)) + b(inx(i))}), it
can be proved that {\tt a(inx(i))} and {\tt b(inx(i))} are always
mapped to the same processor.  Similarly, it is easy to deduce that
{\tt a(i)} and {\tt inx(i)} are mapped together.  Without knowledge of
the values stored in {\tt inx}, however, the relation between {\tt
a(i)} and {\tt a(inx(i))} is unknown, as is the relationship between
{\tt a(i)} and {\tt b(inx(i))}.
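
The dependence of Assignment~7 on the contents of {\tt inx} can be
illustrated with two made-up index vectors. In this Python sketch the
helper names and both index vectors are hypothetical; since {\tt a} and
{\tt b} share one distribution, a single owner function serves for both.

```python
def block_owner(i):
    """1-based processor number owning element i of a, b, or inx."""
    return (i - 1) // 100 + 1


def local_count(inx):
    """Count the values of i for which a(i) and a(inx(i)) share a processor."""
    return sum(
        1 for i in range(1, 1001)
        if block_owner(i) == block_owner(inx[i - 1])  # inx[i - 1] holds inx(i)
    )


identity = list(range(1, 1001))                # inx(i) = i
reversal = [1001 - i for i in range(1, 1001)]  # inx(i) = 1001 - i
print(local_count(identity))  # 1000
print(local_count(reversal))  # 0
```

The identity vector makes every assignment local, while the reversal makes
every assignment remote, so nothing can be concluded without the values.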


\section{Overview of HPF~2.0 Language Features}
\label{overview-hpf2}

The definition of the HPF~2.0 language comprises two main parts:
\begin{itemize}
\item The Base HPF~2.0 Language 
\item HPF~2.0 Approved Extensions
\end{itemize}
The base language features include those features that are expected of every
HPF~2.0 implementation within a few months of release of the 
language specification. This includes basic data distribution features,
data parallel features, a number of intrinsic and library routines, and
the extrinsic mechanism. The base language features are first described,
followed by the approved language extensions. 


\subsection{Base HPF~2.0 Language Features}

\subsubsection*{Data Distribution Features}

Most parallel and sequential architectures attain their highest speed
when the data accessed exhibits locality of reference. The sequential
storage order implied by Fortran standards often conflicts with
the locality demanded by the architecture. To avoid this, HPF includes
features which describe the collocation of data ({\tt ALIGN}) and the
partitioning of data among memory regions or abstract processors ({\tt
DISTRIBUTE}). Compilers may interpret these annotations to improve
storage allocation for data, subject to the constraint that
semantically every data object has a single value at any point in
the program.  In all cases, users should expect the compiler to arrange
the computation to minimize communication while retaining parallelism.
Chapter~\ref{ch-mapping-base} describes the distribution features and 
Chapter~\ref{ch-mapping-subr} defines how the mapping features
interact across subprogram boundaries.

\subsubsection*{Data Parallel Execution Features}

To express parallel computation explicitly, HPF extends 
Fortran with a new statement
and a directive. {\tt REDUCE} allows the user to identify
simple incremental operations inside loops as candidates for
parallelism. 
The {\tt INDEPENDENT} directive asserts that the statements in a
particular section of code do not exhibit any sequentializing
dependencies; when properly used, it does not change the semantics of the
construct, but may provide more information to the language processor
to allow optimizations. Chapter~\ref{ch-parallel} describes these
features.  

\subsubsection*{Extended Intrinsic Functions and Standard Library}

Experience with massively parallel machines has identified several
basic operations that are very valuable in parallel algorithm design.
The Fortran array intrinsics address some of these, but not
all.  HPF adds several classes of parallel
operations to the language definition as intrinsic functions and as standard
library functions.  In addition, several system inquiry functions
useful for controlling parallel execution are provided in HPF.
Chapter~\ref{ch-library} describes these functions and subroutines.

\subsubsection*{Extrinsic Procedures}

Because HPF is designed as a high-level, machine-independent language,
there are certain operations that are difficult or impossible to
express directly.  For example, many applications benefit from
finely-tuned systolic communications on certain machines; HPF's global
address space does not express this well.  Extrinsic procedures define
an explicit interface to procedures written in other paradigms, such as
explicit message-passing subroutine libraries.  Chapter~\ref{ch-extrinsics}
describes this interface. Specific extrinsic
interfaces are defined in Chapter~\ref{ch-extrinsic-ext} and 
Appendix~\ref{ch-extrinsic-app}.

\subsubsection*{Sequence and Storage Association}

A goal of HPF is to maintain compatibility with Fortran.  Full
support of Fortran sequence and storage association, however, is not
compatible with the goal of high performance through distribution of
data in HPF.  Some forms of associating subprogram dummy arguments
with  actual values make assumptions about the sequence of values in
physical memory which may be incompatible with data distribution.
{\tt COMMON} and {\tt EQUIVALENCE} statements are recognized as
requiring a modified storage association paradigm. HPF
provides a directive to assert that full sequence and storage
association for designated variables must be maintained.  In the absence
of such explicit directives, reliance on the properties of association
is not allowed.  An optimizing compiler may then choose to distribute
any variables across processor memories in order to improve
performance.  To protect program correctness, a given implementation
should provide a mechanism to ensure that all such default optimization
decisions are consistent across an entire program.
Chapter~\ref{ch-mapping-subr} describes the restrictions and directives
related to storage and sequence association.  


\subsection{HPF~2.0 Approved Extensions}

\subsubsection*{Extensions for Data Mapping}

The extended mapping features permit greater control over the mapping
of data, including facilities for dynamic realignment and redistribution
of arrays at run-time ({\tt REALIGN, REDISTRIBUTE, DYNAMIC} directives), 
mapping of data among subsets of processors, and
support for irregular distribution of data ({\tt GEN\_BLOCK} and 
{\tt INDIRECT} distributions). In addition, mechanisms are
defined that permit the programmer to provide information to the compiler
about the range of possible distributions an array might take ({\tt RANGE}
directive) and the amount of buffering to be used with arrays involved
in stencil-based nearest-neighbor computations ({\tt SHADOW}).
The approved language extensions pertaining to data mapping are described
in Chapter~\ref{ch-mapping-ext}.

\subsubsection*{Extensions for Data and Task Parallelism}

The extended features facilitate explicit computation partitioning
through use of the {\tt ON} directive. The site of recommended execution
of a loop iteration can either be explicitly specified as a specific
processor or indirectly as the one on which a particular variable or
template element is mapped. In order to assist the compiler in
generating efficient code, the {\tt RESIDENT} directive is defined, to be
used in conjunction with an {\tt ON} directive by the programmer. It can
be used to assert that all accesses to the specified variable within the scope
of the {\tt ON} directive are to be found locally on the executing processor.
Specification of task-level parallelism in HPF is facilitated by the new
{\tt TASKING} construct. The {\tt ON}, {\tt RESIDENT} and {\tt TASKING}
directives are detailed in Chapter~\ref{ch-parallel-ext}.

\subsubsection*{Extensions to Intrinsic and Library Procedures}

The approved extensions to the HPF intrinsics and library routines
relate mostly to mapping inquiry procedures. Some new inquiry routines
are defined and other routines defined by Base HPF~2.0 are extended to
facilitate inquiry about extended mapping features, such as
mapping to processor subsets, {\tt GEN\_BLOCK}, {\tt INDIRECT} and
{\tt DYNAMIC} distributions. A generalization of the Fortran
{\tt TRANSPOSE} intrinsic is also defined. These extensions are defined
in Chapter~\ref{ch-library-ext}.

\subsubsection*{Extensions for Asynchronous I/O}

In order to permit overlap of I/O with computation, an extension 
has been defined for asynchronous {\tt READ} of direct, unformatted
data. This is done through a new statement ({\tt WAIT}) and an
additional I/O control parameter in the Fortran {\tt READ} statement
that specifies non-blocking execution. This extension is described in
Chapter~\ref{ch-io-ext}.

\subsubsection*{Extensions for Extrinsic Interfaces}

A number of specific extrinsic interfaces are defined in
Chapter~\ref{ch-extrinsic-ext} as approved HPF~2.0 extended features.
These include interfaces with different models of parallelism ({\tt LOCAL}
for SPMD parallel, and {\tt SERIAL} for single-process sequential) and
different languages (FORTRAN and C). A set of library routines are also
defined (for FORTRAN) that permit inquiry about the data visible in the
callee's space and its relation to the global view of data in the calling 
HPF space. Chapter~\ref{ch-fortran77-ext} defines an extrinsic interface for
{\tt LOCAL FORTRAN\_77} routines. Additional extrinsic interfaces that are
formally recognized by HPFF, but not defined and maintained by HPFF, are
included in Appendix~\ref{ch-extrinsic-app}. The policy and mechanism
for formal recognition of such extrinsic interfaces is also set out in
Appendix~\ref{ch-extrinsic-app}.

\section{Notation}
\label{overview-notation}

This document uses the same notation as the Fortran standard.  In
particular, the same conventions are used for syntax rules.  BNF
descriptions of language features are given in the style used in the
Fortran standard.  To distinguish HPF syntax rules from Fortran
rules, each HPF rule has an identifying number of the form H\(snn\),
where \(s\) is a one-digit major section number and \(nn\) is a one- or
two-digit sequence number.  The syntax rules are also collected in
Annex~\ref{SYNTAX-ANNEX}.  Nonterminals not defined in this document
are defined in the Fortran standard.  Also note that certain
technical terms such as ``storage unit'' are defined by the Fortran
standard; Annex~\ref{CREF-ANNEX} identifies the Fortran rules 
defining these nonterminals.  References in parentheses in the text
refer to the Fortran (95) standard.

\begin{rationale}
Throughout this document, material explaining the rationale for including 
features, choosing particular feature definitions, and other decisions is 
set off in this format.  Readers interested in the language definition 
only may wish to skip these sections, while readers interested in 
language design may want to read them more carefully.
\end{rationale}

\begin{users}
Throughout this document, material that is primarily commentary for users
(including most examples of syntax and interpretation) 
is set off in this format.  Readers interested in 
technical material only may wish to skip these sections, while readers 
wanting a more basic approach may want to read them more carefully.
\end{users}

\begin{implementors}
Throughout this document, material that is primarily commentary for 
implementors is set off in this format.  Readers interested in the 
language definition only may wish to skip these sections, while readers 
interested in compiler implementation may want to read them more carefully.
\end{implementors}



%where does this really go???
\section{Syntax of Directives}

HPF directives are consistent with Fortran  syntax in the following
sense: if any HPF directive were to be adopted as part of a future
Fortran standard, the only change necessary to convert an HPF program
would be to replace the directive-origin with blanks.


THE FOLLOWING BNF NEEDS TO BE UPDATED FOR ANY NEW DIRECTIVE CATEGORIES
ALONG WITH STRATEGY FOR EXTENDED VS 2.0 DIRECTIVES


                                                      \BNF
hpf-directive-line     \IS   directive-origin hpf-directive


directive-origin       \IS   !HPF$
                       \OR   CHPF$
                       \OR   *HPF$

hpf-directive          \IS   specification-directive
                       \OR   executable-directive

specification-directive  \IS   processors-directive
                       \OR   align-directive
                       \OR   distribute-directive
                       \OR   dynamic-directive
                       \OR   inherit-directive
                       \OR   template-directive
                       \OR   combined-directive
                       \OR   sequence-directive

executable-directive   \IS   realign-directive
                       \OR   redistribute-directive
                       \OR   independent-directive
                                                      \FNB

\begin{constraints}

\item An {\it hpf-directive-line} cannot be commentary
following another statement on the same line.

\item A {\it specification-directive} may appear only
where a {\it declaration-construct} may appear.

\item An {\it executable-directive} may appear only where an {\it
executable-construct} may appear.

\item An {\it hpf-directive-line} follows the rules of either Fortran
free form (3.3.1.1) or fixed form (3.3.2.1) comment lines, depending
on the source form of the surrounding Fortran source in that
program unit (3.3).

\end{constraints}

An {\it hpf-directive} is case insensitive and
conforms to the rules for blanks in free
source form (3.3.1), even in an HPF program otherwise in fixed source
form.  However, an HPF-conforming processor is not required to diagnose
extra or missing blanks in an HPF directive.
Note that, due to Fortran rules, the {\it directive-origin}
in free source form must be the characters {\tt !HPF\$}.
HPF directives may be continued, in which case
each continued line also begins with a
{\it directive-origin}. No  statements may
be interspersed within a continued HPF-directive.  HPF directive lines
must not appear within a continued statement.
HPF directive lines may include trailing commentary.
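
A tool that scans source text might recognize directive lines along the
following lines. This Python sketch is an assumption-laden illustration,
not a definition: the function name is hypothetical, the
{\it directive-origin} is matched without regard to case, leading blanks are
tolerated in free source form (as for any free-form comment), and the
origin is required to start in column 1 in fixed source form.

```python
FIXED_FORM_ORIGINS = ("!HPF$", "CHPF$", "*HPF$")


def is_directive_line(line, free_form):
    """Return True if the line begins with a directive-origin valid for the source form."""
    if free_form:
        # Only !HPF$ can start a comment in free source form.
        return line.lstrip(" ").upper().startswith("!HPF$")
    # In fixed source form the comment character must be in column 1.
    return line[:5].upper() in FIXED_FORM_ORIGINS


print(is_directive_line("!HPF$ PROCESSORS procs(10)", free_form=True))   # True
print(is_directive_line("CHPF$ PROCESSORS procs(10)", free_form=True))   # False
print(is_directive_line("CHPF$ PROCESSORS procs(10)", free_form=False))  # True
print(is_directive_line("      x = 1  !HPF$ trailing text", free_form=True))  # False
```

The last call reflects the constraint that a directive line cannot be
commentary following another statement on the same line.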

In either source form, the blanks in the adjacent keywords
{\tt END FORALL} and {\tt NO SEQUENCE} are optional.

An example of an HPF directive continuation in free source form is:

                                                      \CODE
!HPF$ ALIGN ANTIDISESTABLISHMENTARIANISM(I,J,K) &
!HPF$        WITH ORNITHORHYNCHUS_ANATINUS(J,K,I)
                                                      \EDOC

An example of an HPF directive continuation in fixed source form follows.
Observe that column 6 must be blank, except when signifying continuation.

                                                       \CODE
!HPF$ ALIGN ANTIDISESTABLISHMENTARIANISM(I,J,K)
!HPF$*WITH ORNITHORHYNCHUS_ANATINUS(J,K,I)
                                                       \EDOC

This example shows  an HPF directive continuation which is ``universal'' in
that it can be treated as either fixed source form or free source form.
Note that the ``{\tt \&}'' in the first line is in column 73.

                                                       \CODE
!HPF$ ALIGN ANTIDISESTABLISHMENTARIANISM(I,J,K)                         &
!HPF$&WITH ORNITHORHYNCHUS_ANATINUS(J,K,I)
                                                       \EDOC
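
Joining a continued directive back into one logical line might look like
the following Python sketch. It handles free source form only, the function
name is hypothetical, and it assumes that the lines carry no trailing
commentary and no character-context continuation.

```python
def join_free_form_directive(lines):
    """Join the lines of one continued free-form HPF directive into a single string."""
    parts = []
    for line in lines:
        text = line.strip()
        if not text.upper().startswith("!HPF$"):
            raise ValueError("each line of a continued directive needs a directive-origin")
        body = text[5:].strip()
        if body.startswith("&"):   # optional leading continuation marker
            body = body[1:].lstrip()
        if body.endswith("&"):     # trailing marker: the directive continues
            body = body[:-1].rstrip()
        parts.append(body)
    return "!HPF$ " + " ".join(parts)


free_form = [
    "!HPF$ ALIGN ANTIDISESTABLISHMENTARIANISM(I,J,K) &",
    "!HPF$        WITH ORNITHORHYNCHUS_ANATINUS(J,K,I)",
]
print(join_free_form_directive(free_form))
# !HPF$ ALIGN ANTIDISESTABLISHMENTARIANISM(I,J,K) WITH ORNITHORHYNCHUS_ANATINUS(J,K,I)
```

When read as free source form, the ``universal'' example above joins to the
same string, because the leading ``{\tt \&}'' on its second line is also removed.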

---------------------------------------------------------------------------
To (un)subscribe to this list, send mail to hpff-doc-request@cs.rice.edu.
Leave the subject line blank, and in the body put the line
(un)subscribe <email-address>
---------------------------------------------------------------------------

From owner-hpff-doc  Mon Sep  9 10:57:49 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id KAA29484 for hpff-doc-out; Mon, 9 Sep 1996 10:57:49 -0500 (CDT)
Received: from mail12.digital.com (mail12.digital.com [192.208.46.20]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id KAA29478 for <hpff-doc@cs.rice.edu>; Mon, 9 Sep 1996 10:57:46 -0500 (CDT)
From: loveman@ovid.eng.pko.dec.com
Received: from Jaxom.Eng.PKO.DEC.Com by mail12.digital.com (8.7.5/UNX 1.2/1.0/WV)
	id LAA11175; Mon, 9 Sep 1996 11:43:42 -0400 (EDT)
Received: from ovid.Eng.PKO.DEC.Com by Jaxom.Eng.PKO.DEC.Com; (5.65/1.1.8.2/15Jan96-8.2MAM)
	id AA28470; Mon, 9 Sep 1996 11:43:40 -0400
Received: by ovid.eng.pko.dec.com; id AA29011; Mon, 9 Sep 1996 11:40:49 -0400
Message-Id: <9609091540.AA29011@ovid.eng.pko.dec.com>
To: zongaro@vnet.ibm.com, meltzer@cray.com, hpff-doc@cs.rice.edu
Cc: loveman@ovid.eng.pko.dec.com
Subject: hpff-doc: Comments on Interoperability
Date: Mon, 09 Sep 96 11:40:48 -0400
X-Mts: smtp
Sender: owner-hpff-doc
Precedence: bulk


I have decided to drop completely any discussion of MAP_TO in 
chapter 6.  You might want to consider using some of what was
in my last draft; see below.


===========================================================


\subsection{Dummy Arguments}

\BNF

map-to                  \IS MAP_TO ( map-to-kind-arg-spec-list )

map-to-kind-arg-spec    \IS map-to-type

			\OR map-to-layout

			\OR map-to-pass-by

map-to-type             \IS [ TYPE = ] char-initialization-expr

map-to-layout           \IS [ LAYOUT = ] char-initialization-expr

map-to-pass-by          \IS [ PASS_BY = ] char-initialization-expr

\FNB

\begin{constraints}

\item The definition of {\em characteristics of a dummy data object} as
given in Section 12.2.1.1 of the May 1995 draft Fortran 95 standard is
extended to include the dummy data object's data mapping (alignment and
distribution) and its {\tt MAP\_TO} characteristics.

\item The rules for {\it map-to-kind-arg-spec-list} are as if {\tt
MAP\_TO} were a procedure with an explicit interface with a {\it
dummy-arg-list} of {\tt TYPE, LAYOUT, PASS\_BY}, each of which were {\tt
OPTIONAL}. Note that, in a {\it map-to-kind-arg-spec-list}, at least
one of {\it map-to-type}, {\it map-to-layout}, or {\it map-to-pass-by}
must occur.

\item In {\it map-to-type}, values of {\it char-initialization-expr}
are intended to describe how the data type of the named actual argument
is mapped to the data type of the dummy argument in the extrinsic
procedure. An example is given in Chapter ??.

\item In {\it map-to-layout}, values of {\it char-initialization-expr}
are intended to describe how the data layout of the named actual
argument is mapped to the data layout of the dummy argument in the
extrinsic procedure. An example is given in Chapter ??.

\item In {\it map-to-pass-by}, values of {\it char-initialization-expr}
are intended to describe the mechanism used to associate the named
actual argument with the dummy argument in the extrinsic procedure. An
example is given in Chapter ??.

\end{constraints}





From owner-hpff-doc  Mon Sep  9 11:33:34 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id LAA01536 for hpff-doc-out; Mon, 9 Sep 1996 11:33:34 -0500 (CDT)
Received: from mail12.digital.com (mail12.digital.com [192.208.46.20]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id LAA01523 for <hpff-doc@cs.rice.edu>; Mon, 9 Sep 1996 11:33:25 -0500 (CDT)
From: loveman@ovid.eng.pko.dec.com
Received: from Jaxom.Eng.PKO.DEC.Com by mail12.digital.com (8.7.5/UNX 1.2/1.0/WV)
	id MAA00113; Mon, 9 Sep 1996 12:26:53 -0400 (EDT)
Received: from ovid.Eng.PKO.DEC.Com by Jaxom.Eng.PKO.DEC.Com; (5.65/1.1.8.2/15Jan96-8.2MAM)
	id AA29233; Mon, 9 Sep 1996 12:26:52 -0400
Received: by ovid.eng.pko.dec.com; id AA29078; Mon, 9 Sep 1996 12:24:03 -0400
Message-Id: <9609091624.AA29078@ovid.eng.pko.dec.com>
To: hpff-doc@cs.rice.edu
Cc: loveman@ovid.eng.pko.dec.com
Subject: hpff-doc: extrinsics.tex
Date: Mon, 09 Sep 96 12:24:02 -0400
X-Mts: smtp
Sender: owner-hpff-doc
Precedence: bulk

% File: extrinsics.tex

% Contents:  Extrinsics chapter for HPF 2.0 document

% Revision history:

% May-10-96     Created by Charles Koelbel, Rice University (from HPF 1.2
%               document)

% Aug-15-96     Draft stage new chapter (David Loveman) in interim stage (mez)

% Sep-09-96     Next draft of chapter by David Loveman, Digital.

\chapter{Extrinsic Program Units}

\label{ch-extrinsics}

{\em Comments on this chapter should be directed to David Loveman ({\tt
loveman@ovid.eng.pko.dec.com}) and {\tt hpff-doc@cs.rice.edu}. Please
use ``{\tt Comments on Extrinsics}'' as the {\tt Subject:} line.}

The HPF global model of computation extends (and restricts) Fortran to
provide programmers with the Fortran model of computation in a form that
can be implemented efficiently on a wide class of hardware architectures
with, in general, multiple processors, multiple memories with non-uniform
access characteristics, and multiple interconnections. This model of
computation presents a single logical thread of control, including
Fortran's data parallel features such as array syntax and the {\tt FORALL}
statement, and data visibility defined by the scoping rules of Fortran.
In particular, this model does not require the use of low-level
features such as threads libraries and explicit message passing to
exploit such architectures. Programmers expect their HPF compilers to
generate efficient code by using HPF's features to assist in mapping
data and computation to the given hardware architecture.

This chapter defines the {\it extrinsic} mechanism by which HPF program
units may use non-HPF program units that do not use the HPF global
model. It describes how to write an explicit interface for a non-HPF
procedure and defines the caller's assumptions about handling
distributed and replicated data at the interface. This allows the
programmer to use non-HPF language facilities, for example to descend
to a lower level of abstraction to handle problems that are not
efficiently addressed by HPF, to hand-tune critical kernels, or to call
optimized libraries. Such an interface can also be used to interface
HPF to other languages, such as C.

\section{Overview}

An HPF program may need to call a procedure implemented in a different
programming model or in a different programming language. A procedure's
{\it programming model} might provide:

\begin{itemize}

\item a single logical thread-of-control where {\it one} copy of the
procedure is conceptually executing and there is a single locus of
control within the program text; this model is called {\it global} when
the underlying target hardware has (potentially) multiple processors or
memories and is called {\it serial} when the underlying target hardware
is treated as a uniprocessor (or a single node in a multiprocessor);

\item multiple threads-of-control, perhaps with dynamic assignment of
loop iterations to processors or explicit dynamic process forking,
where there is, at least initially upon invocation, {\it one} copy of
the procedure that is conceptually executing but which may spawn
multiple loci of control, possibly changing in number over time, within
the program text, or

\item multiple threads-of-control, one per processor, each thread
executing the same procedure; this model is called {\it local} or, more
generally, SPMD (Single Program, Multiple Data).

\end{itemize}

A {\it programming language} provides a specific syntax (language
features), semantics (meanings), and pragmatics (purposes). Examples of
programming languages include Fortran (an ANSI and ISO standard whose
most recent revision is expected to be approved in 1996), HPF (a specification
of extensions and restrictions to Fortran), Fortran 77 (a previous ANSI
and ISO standard), C, C++, Java, Visual Basic, and COBOL.

A program unit's language and model, when taken together, constitute
its {\it extrinsic kind}. This {\it extrinsic kind} may be specified
explicitly by an {\it extrinsic-prefix} or implicitly by the selection
of a compiler and its invocation with a particular set of compiler
options. Thus, one might view the compiler as providing a {\it host
scoping unit} as defined by Fortran. For example, a program unit
compiled by an HPF compiler will be of extrinsic kind {\tt HPF}.
Alternatively, its extrinsic kind may be specified explicitly by an
{\it extrinsic-prefix} such as {\tt EXTRINSIC(HPF)} or {\tt
EXTRINSIC(LANGUAGE="HPF", MODEL="GLOBAL")}.

\section{Declaration of Extrinsic Program Units}

\subsection{Function and Subroutine Statements}

An {\it extrinsic-prefix} may appear in a {\it function-stmt} or {\it
subroutine-stmt} (as defined in the Fortran standard) in the same place
that the keywords {\tt RECURSIVE}, {\tt PURE}, and {\tt ELEMENTAL} may
appear. This is specified by an extension of rule R1219 for {\it
prefix-spec} in the May 1995 draft Fortran 95 standard. Rules R1217 for
{\it function-stmt}, R1218 for {\it prefix}, and R1222 for {\it
subroutine-stmt} are not changed, but are rewritten here for
reference.

\BNF
function-stmt           \IS [ prefix ] FUNCTION function-name
			      ( [ dummy-arg-name-list ] )
			      [ RESULT ( result-name ) ]
subroutine-stmt         \IS [ prefix ] SUBROUTINE subroutine-name
			      [ ( [ dummy-arg-list ] ) ]
prefix                  \IS prefix-spec [ prefix-spec ] ...
prefix-spec             \IS type-spec
			\OR RECURSIVE
			\OR PURE
			\OR ELEMENTAL
			\OR extrinsic-prefix
\FNB

\begin{constraints}

\item The definition of {\em characteristics of a procedure} as given
in Section 12.2 of the May 1995 draft Fortran 95 standard is extended
to include the procedure's extrinsic kind.

\item Within any HPF {\it external-subprogram}, every {\it
internal-subprogram} must be of the same extrinsic kind as its host and
any {\it internal-subprogram} whose extrinsic kind is not given
explicitly is assumed to be of that extrinsic kind.

\end{constraints}

\subsection{Program, Module, and Block Data Statements}

An {\it extrinsic-prefix} may also appear at the beginning of a {\it
program-stmt}, {\it module-stmt}, or {\it block-data-stmt}. The
following syntax definition extends the Fortran 95 syntax rules R1102
for {\it program-stmt}, R1105 for {\it module-stmt}, and R1111 for {\it
block-data-stmt}.

\BNF
program-stmt            \IS [ extrinsic-prefix ] PROGRAM program-name
module-stmt             \IS [ extrinsic-prefix ] MODULE module-name
block-data-stmt         \IS [ extrinsic-prefix ] BLOCK DATA [ block-data-name ]
\FNB

\begin{constraints}

\item Within any HPF {\it module}, every {\it module-subprogram} must
be of the same extrinsic kind as its host and any {\it
module-subprogram} whose extrinsic kind is not given explicitly is
assumed to be of that extrinsic kind.

\item Within any HPF {\it main-program} or {\it module-subprogram},
every {\it internal-subprogram} must be of the same extrinsic kind as
its host and any {\it internal-subprogram} whose extrinsic kind is not
given explicitly is assumed to be of that extrinsic kind.

\item In any HPF {\it block-data-stmt}, if an {\it extrinsic-prefix} is
present, a {\it block-data-name} must also be present.

\end{constraints}
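
As an illustration of these rules, the following sketch shows an {\it
extrinsic-prefix} on a {\it module-stmt} and on a {\it
block-data-stmt}. The names {\tt KERNELS}, {\tt SCALEPART}, {\tt
TABLES}, and {\tt COEFFS} are hypothetical and chosen only for this
example.

\CODE
EXTRINSIC(HPF_LOCAL) MODULE KERNELS
CONTAINS
  ! No extrinsic-prefix here: assumed to be HPF_LOCAL, like its host.
  SUBROUTINE SCALEPART(X, S)
    REAL, INTENT(INOUT) :: X(:)
    REAL, INTENT(IN)    :: S
    X = S * X
  END SUBROUTINE SCALEPART
END MODULE KERNELS

! An extrinsic-prefix is present, so the block data name is required.
EXTRINSIC(HPF_SERIAL) BLOCK DATA TABLES
  REAL C(3)
  COMMON /COEFFS/ C
  DATA C /1.0, 2.0, 3.0/
END BLOCK DATA TABLES
\EDOC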

\subsection{The Extrinsic Prefix}

\BNF
extrinsic-prefix        \IS EXTRINSIC ( extrinsic-spec )
extrinsic-spec          \IS extrinsic-spec-arg-list
			\OR extrinsic-kind-keyword
extrinsic-spec-arg      \IS language
			\OR model
			\OR external-name
language                \IS [ LANGUAGE = ] char-initialization-expr
model                   \IS [ MODEL = ] char-initialization-expr
external-name           \IS [ EXTERNAL_NAME = ] char-initialization-expr
\FNB

The following constraints are in addition to those specified in the May
1995 draft Fortran 95 standard.

\begin{constraints}

\item  The rules for {\it extrinsic-spec-arg-list} are as if {\tt
EXTRINSIC} were a procedure with an explicit interface with a {\it
dummy-arg-list} of {\tt LANGUAGE, MODEL, EXTERNAL_NAME}, each of which
were {\tt OPTIONAL}. Note that, in an {\it extrinsic-spec-arg-list}, at
least one of {\it language}, {\it model}, or {\it external-name} must
occur.

\item In {\it language}, values of {\it char-initialization-expr} may
be:

\begin{itemize}

\item {\tt "HPF"}, referring to the HPF language; if a {\it model} is
not explicitly specified, the default {\it model} is {\tt "GLOBAL"};

\item {\tt "FORTRAN"}, referring to the ANSI/ISO standard Fortran
language; if a {\it model} is not explicitly specified, the default
{\it model} is {\tt "SERIAL"};

\item {\tt "C"}, referring to the ANSI standard C programming language;
if a {\it model} is not explicitly specified, the default {\it model}
is {\tt "SERIAL"}; or

\item an implementation-dependent value with an
implementation-dependent default {\it model}.

\end{itemize}

If {\it language} is not specified, it is the same as that of the host
scoping unit in which the {\it extrinsic-prefix} occurs.

\item In {\it model}, values of {\it char-initialization-expr} may be:

\begin{itemize}

\item {\tt "GLOBAL"}, referring to the global model,

\item {\tt "LOCAL"}, referring to the local model,

\item {\tt "SERIAL"}, referring to the serial model, or

\item an implementation-dependent value.

\end{itemize}

If {\it model} is not specified, it is the same as that of the host
scoping unit in which the {\it extrinsic-prefix} occurs.

\item All {\it language}s and {\it model}s whose names begin with the
three letters ``{\tt HPF}'' are reserved for present or future
definition by this specification and its successors.

\item In {\it external-name}, the value of {\it
char-initialization-expr} is the name by which the program unit is
known outside of the program, for example in the external file system
or program library. If {\it external-name} is not specified, its value
is implementation-dependent.

\end{constraints}

HPF defines three {\it extrinsic-kind-keyword}s: {\tt HPF}, {\tt
HPF_LOCAL}, and {\tt HPF_SERIAL}.

\BNF
extrinsic-kind-keyword  \IS HPF
			\OR HPF_LOCAL
			\OR HPF_SERIAL
\FNB

\begin{constraints}

\item {\tt EXTRINSIC(HPF)} is equivalent to {\tt EXTRINSIC("HPF",
"GLOBAL")}. In the absence of an {\it extrinsic-prefix}, an HPF compiler
interprets a compilation unit as if it were of extrinsic kind {\tt
HPF}. Thus, for an HPF compiler, specifying {\tt EXTRINSIC(HPF)} or
{\tt EXTRINSIC("HPF", "GLOBAL")} is redundant. Such explicit
specification may, however, be required for use with a compiler that
supports multiple extrinsic kinds.

\item {\tt EXTRINSIC(HPF_LOCAL)} is equivalent to {\tt EXTRINSIC("HPF",
"LOCAL")}. A {\it main-program} whose extrinsic kind is {\tt HPF_LOCAL}
behaves as if it were a subroutine of extrinsic kind {\tt HPF_LOCAL}
that is called with no arguments from a main program of extrinsic kind
{\tt HPF} whose executable part consists solely of that call.

\item {\tt EXTRINSIC(HPF_SERIAL)} is equivalent to {\tt
EXTRINSIC("HPF", "SERIAL")}. A {\it main-program} whose extrinsic kind
is {\tt HPF_SERIAL} behaves as if it were a subroutine of extrinsic
kind {\tt HPF_SERIAL} that is called with no arguments from a main
program of extrinsic kind {\tt HPF} whose executable part consists
solely of that call.

\item All {\it extrinsic-kind-keyword}s whose names begin with the
three letters ``{\tt HPF}'' are reserved for present or future
definition by this specification and its successors.

\end{constraints}
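
The following sketch, offered only as an illustration, shows several
spellings of an {\it extrinsic-prefix} that the rules above appear to
permit inside an interface block. The procedure names {\tt RELAX1},
{\tt RELAX2}, {\tt RELAX3}, and {\tt CNORM}, and the external name
{\tt "c_norm"}, are hypothetical. Whether an implementation supports
the {\tt "C"} language is implementation-dependent.

\CODE
INTERFACE
  ! Keyword form; equivalent to EXTRINSIC("HPF", "LOCAL").
  EXTRINSIC(HPF_LOCAL) SUBROUTINE RELAX1(X)
    REAL X(:)
  END SUBROUTINE RELAX1
  ! Positional form: LANGUAGE first, then MODEL.
  EXTRINSIC("HPF", "LOCAL") SUBROUTINE RELAX2(X)
    REAL X(:)
  END SUBROUTINE RELAX2
  ! Named arguments may be given in any order.
  EXTRINSIC(MODEL="LOCAL", LANGUAGE="HPF") SUBROUTINE RELAX3(X)
    REAL X(:)
  END SUBROUTINE RELAX3
  ! All three arguments given explicitly for a non-HPF language.
  EXTRINSIC(LANGUAGE="C", MODEL="SERIAL", EXTERNAL_NAME="c_norm") &
      FUNCTION CNORM(X)
    REAL X(:), CNORM
  END FUNCTION CNORM
END INTERFACE
\EDOC

The first three interfaces declare the same extrinsic kind; only the
spelling of the prefix differs.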

\begin{implementors}

Other {\it language}s, {\it model}s, or {\it extrinsic-kind-keyword}s
may be defined and provided by compiler vendors. Although not part of
this HPF specification, they are expected to conform to the rules and
spirit of HPF extrinsic kinds.

An implementation may place certain restrictions on the programmer;
moreover, each extrinsic kind may call for a different set of
restrictions.

For example, an implementation on a parallel processor may find it
convenient to replicate scalar arguments so as to provide a copy on
every processor. This is permitted so long as this process is invisible
to the caller. One way to achieve this is to place a restriction on the
programmer: on return from the subprogram, all the copies of this
scalar argument must have the same value. This implies that if the
dummy argument has {\tt INTENT(OUT)}, then all copies must have been
updated consistently by the time of subprogram return.

\end{implementors}

\section{Calling HPF Extrinsic Subprograms}

A call to an extrinsic procedure behaves, as observed by a calling
program coded in HPF, exactly as if the subprogram were coded in HPF.
If a function or subroutine called from a program unit of an HPF
extrinsic kind does not have an explicit interface visible in the
caller, it is assumed to have the same extrinsic kind as the caller.

In order to call a subprogram of an extrinsic kind other than that of
the caller, that subprogram must have an explicit interface visible in
the caller, and the subprogram is expected to behave, as observed by
the caller, roughly as if it had been written as code of the same
extrinsic kind as the caller. Some of the responsibility for meeting
this requirement may rest on the compiler and some on the programmer.
This interface defines the ``HPF view'' of the extrinsic procedure.

A called procedure that is written in a model or language other than
HPF, whether or not it uses the local procedure execution model, should
be declared {\tt EXTRINSIC} within an HPF program that calls it. The
{\tt EXTRINSIC} prefix declares what sort of interface should be used
when calling indicated subprograms. If there is no extrinsic
specification, then the user must assume full responsibility for
correctness of the implementation-dependent interface.

A {\it function-stmt} or {\it subroutine-stmt} that appears within an
{\it interface-block} within a program unit of an HPF extrinsic kind
may have an extrinsic prefix mentioning any extrinsic kind supported by
the language implementation. If no {\it extrinsic-prefix} appears in
such a {\it function-stmt} or {\it subroutine-stmt}, then it is assumed
to be of the same HPF extrinsic kind as the program unit in which the
interface block appears.

The procedure characteristics defined by an {\it interface-body} must
be consistent with the procedure's definition.

The definition and rules for a procedure with an extrinsic interface
lie outside the scope of HPF. However, explicit interfaces to such
procedures must conform to HPF. Note that any particular HPF
implementation is free to support any selection of extrinsic kinds, or
none at all except for {\tt HPF} itself, which clearly must be supported
by an HPF implementation.

\subsection{Access to Types, Procedures, and Data}

If a module X of one HPF extrinsic kind is used by a program unit Y of
another HPF extrinsic kind, then only names of items in X that Y is
entitled to use or invoke may be imported; that is, either X makes
private all items that Y is not entitled to use, or the {\tt USE}
statement in Y has an {\tt ONLY} option that lists only names of items
it is entitled to use.

Derived type definitions may be thought of as ``extrinsic kind
neutral;'' a program unit of any HPF extrinsic kind may use derived
type definitions from a module of any HPF extrinsic kind.

An HPF global program or procedure may call other HPF procedures that
are global, local, or serial.

An HPF local program or procedure may only call other HPF local
procedures.

An HPF serial program or procedure may only call other HPF serial
procedures.

A named {\tt COMMON} block in any program unit of an HPF kind will be
associated with the {\tt COMMON} block, if any, of that same name in
every other program unit of that same extrinsic kind; similarly for
unnamed {\tt COMMON}. (Such {\tt COMMON} storage behaves as other
declared data objects within program units of that extrinsic kind; in
particular, for {\tt HPF_LOCAL} code there will be one copy of the {\tt
COMMON} block on each processor.)

It is not permitted for any given {\tt COMMON} block name to be used in
program units of different HPF kinds within a single program;
similarly, it is not permitted for unnamed {\tt COMMON} to be used in
program units of different HPF kinds within a single program.

A particular restriction is placed on HPF local procedures to be called
from an HPF procedure or global program: array dummy arguments and
array components of dummy arguments of a derived type must be declared
as assumed-shape, both in the definition of the subprogram itself and
in any interface blocks in other program units.

\begin{table}
\def\QQ#1#2#3{\hbox to 5em{{\bf #1}\quad{\bf #2}\quad{\bf #3}\hfill}}
\begin{tabular}{lr|c|c|c}
&&\multicolumn{3}{c}                 {extrinsic kind of the used module} \\
	       &                &{\tt HPF}     &{\tt HPF_SERIAL}&{\tt
	       HPF_LOCAL} \\ \hline
extrinsic kind &{\tt HPF}       & \QQ{T}{P}{D} & \QQ{T}{P}{ }   &
\QQ{T}{P}{ }   \\ \hline
of the using   &{\tt HPF_SERIAL}& \QQ{T}{ }{ } & \QQ{T}{P}{D}   &
\QQ{T}{ }{ }   \\ \hline
program unit   &{\tt HPF_LOCAL} & \QQ{T}{ }{ } & \QQ{T}{ }{ }   &
\QQ{T}{P}{D}   \\[8pt]
\multicolumn{5}{l}   {{\bf T} = derived type definitions} \\
\multicolumn{5}{l}   {{\bf P} = procedures and procedure interfaces} \\
\multicolumn{5}{l}   {{\bf D} = data objects}
\end{tabular}
\caption{Entities that a using program unit is entitled to access from
a module, according to the HPF extrinsic kind of each}
\end{table}

\subsection{The Effect of a Call}

A call to an extrinsic procedure must be semantically equivalent to a
call of an ordinary HPF procedure. Thus a call to an extrinsic
procedure must behave {\it as if} the following actions occur. The HPF
technical term {\it as if} means that the described actions should
appear to a user as if they occurred, in the order specified; an
implementation may carry out any actions in any order that provide the
correct user-visible effects.

\begin{enumerate}

\item All actions of the caller preceding the subprogram invocation
should be completed before any action of the subprogram is executed;
and all actions of the subprogram should be completed before any action
of the caller following the subprogram invocation is executed.

\item Each actual argument is remapped, if necessary, according to the
directives (explicit or implicit) in the declared interface for the
extrinsic procedure. Thus, HPF mapping directives appearing in the
interface are binding---the compiler must obey these directives in
calling local extrinsic procedures. Actual arguments corresponding to
scalar dummy arguments are replicated (by broadcasting, for example) in
all processors. As in the case of non-extrinsic subprograms, actual
arguments may be mapped in any way; if necessary, they are copied
automatically to correctly mapped temporaries before invocation of and
after return from the extrinsic procedure. The default mapping of
scalar dummy arguments and of scalar function results is such that the
argument is replicated on each physical processor. These mappings may,
optionally, be explicit in the interface, but any other explicit
mapping is not HPF conforming.

\item {\tt IN}, {\tt OUT}, and {\tt INOUT} intent restrictions should
be observed.

\item No HPF variable is modified unless it could be modified by an HPF
procedure with the same explicit interface. Note in particular that
even though an {\tt HPF_LOCAL} routine is not permitted to access and
modify HPF global data, other kinds of extrinsic routines may do so to
the extent that an HPF procedure could.

\item When a procedure returns and the caller resumes execution, all
objects accessible to the caller after the call are mapped exactly as
they were before the call. In particular, the original distribution of
arguments is restored, if necessary.

\item Exactly the same set of processors is visible to the HPF
environment before and after the subprogram call.

\end{enumerate}

\begin{implementors}

\item To ensure that all actions that logically precede the call are
completed, multiple processors may need to be synchronized before the
call is made.

\item If a variable accessible to the called routine has a replicated
representation, then all copies may need to be updated prior to the
call to contain the correct current value according to the sequential
semantics of the source program.

\item Replicated variables, if updated in the procedure, must be
updated consistently. More precisely, if a variable accessible to a
procedure has a replicated representation and is updated by (one or
more copies of) the procedure, then all copies of the replicated
variable must have identical values when the last processor returns
from the local procedure.

An implementation might check, before returning from the local
subprogram, to make sure that replicated variables have been updated
consistently by the subprogram. However, there is certainly no
requirement---perhaps not even any encouragement---to do so. This is
merely a tradeoff between speed and, for instance, debuggability.

Note that, as with a global HPF subprogram, actual arguments may be
copied or remapped in any way, so long as the effect is undone on
return from the subprogram.

\item To ensure that all actions of the procedure logically complete
before execution in the caller is resumed, multiple processors may need
to be synchronized after the call.

\end{implementors}
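
The caller's side of these rules can be sketched as follows. This is
an illustration under stated assumptions, not a normative example: the
names {\tt DRIVER}, {\tt SCALE}, and {\tt B} are hypothetical, and the
comments describe only the {\it as if} behavior listed above.

\CODE
PROGRAM DRIVER
  INTERFACE
    EXTRINSIC(HPF_LOCAL) SUBROUTINE SCALE(A, S)
      REAL, INTENT(INOUT) :: A(:)     ! assumed-shape array dummy
      REAL, INTENT(IN)    :: S        ! scalar dummy, replicated
      !HPF$ DISTRIBUTE A(BLOCK)       ! binding on the caller
    END SUBROUTINE SCALE
  END INTERFACE
  REAL B(1000)
  !HPF$ DISTRIBUTE B(CYCLIC)
  B = 1.0
  ! B behaves as if copied to a BLOCK-mapped temporary for the call.
  CALL SCALE(B, 2.0)
  ! On return B is mapped CYCLIC again, exactly as before the call.
END PROGRAM DRIVER
\EDOC

Because the interface directive differs from the mapping of the actual
argument, the remapping before the call and its restoration afterward
are the responsibility of the implementation, not of the programmer.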

\section{Examples of Extrinsic Procedures}

Consider:

\CODE
PROGRAM DUMPLING
  INTERFACE
    EXTRINSIC(HPF_LOCAL) SUBROUTINE GNOCCHI(P, L, X)
      INTERFACE
        SUBROUTINE P(Q)
          REAL Q
        END SUBROUTINE P
        EXTRINSIC(COBOL_LOCAL) SUBROUTINE L(R)
          REAL R(:,:)
        END SUBROUTINE L
      END INTERFACE
      REAL X(:)
    END SUBROUTINE GNOCCHI
    EXTRINSIC(HPF_LOCAL) SUBROUTINE POTSTICKER(Q)
      REAL Q
    END SUBROUTINE POTSTICKER
    EXTRINSIC(COBOL_LOCAL) SUBROUTINE LEBERKNOEDEL(R)
      REAL R(:,:)
    END SUBROUTINE LEBERKNOEDEL
  END INTERFACE
  ...
  CALL GNOCCHI(POTSTICKER, LEBERKNOEDEL, (/ 1.2, 3.4, 5.6 /) )
  ...
END PROGRAM DUMPLING
\EDOC

The main program, {\tt DUMPLING}, when compiled by an HPF compiler, is
implicitly of extrinsic kind {\tt HPF}. Interfaces are declared to
three external subroutines {\tt GNOCCHI}, {\tt POTSTICKER}, and {\tt
LEBERKNOEDEL}. The first two are of extrinsic kind {\tt HPF_LOCAL} and
the third is of kind {\tt COBOL_LOCAL}. Now {\tt GNOCCHI} accepts two
dummy procedure arguments and so interfaces must be declared for those.
Because no {\it extrinsic-prefix} is given for dummy argument {\tt P},
its extrinsic kind is that of its host scoping unit, the declaration of
subroutine {\tt GNOCCHI}, which has extrinsic kind {\tt HPF_LOCAL}. The
declaration of the corresponding actual argument {\tt POTSTICKER} needs
to have an explicit {\it extrinsic-prefix} because its host scoping
unit is program {\tt DUMPLING}, of extrinsic kind {\tt HPF}.

As a second example, consider:

\CODE
  INTERFACE
    EXTRINSIC(HPF_LOCAL) FUNCTION BAGEL(X)
      REAL X(:)
      REAL BAGEL(100)
        !HPF$ DISTRIBUTE (CYCLIC) :: X, BAGEL
    END FUNCTION
  END INTERFACE

  INTERFACE OPERATOR (+)
    EXTRINSIC(C_LOCAL) FUNCTION LATKES(X, Y) RESULT(Z)
      REAL, DIMENSION(:,:), INTENT(IN) :: X
      REAL, DIMENSION(SIZE(X,1), SIZE(X,2)), INTENT(IN) :: Y
      REAL, DIMENSION(SIZE(X,1), SIZE(X,2)) :: Z
        !HPF$ ALIGN WITH X :: Y, Z
        !HPF$ DISTRIBUTE (BLOCK, BLOCK) :: X
    END FUNCTION
  END INTERFACE

  INTERFACE KNISH
    FUNCTION RKNISH(X)                      !normal HPF interface
      REAL X(:), RKNISH
    END FUNCTION RKNISH
    EXTRINSIC(SISAL) FUNCTION CKNISH(X)     !extrinsic interface
      COMPLEX X(:), CKNISH
    END FUNCTION CKNISH
  END INTERFACE
\EDOC

In the last interface block, two external procedures, one of them
extrinsic and one not, are associated with the same generic procedure
name, which returns a scalar of the same type as its array argument.


From owner-hpff-doc  Tue Sep 10 09:52:55 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id JAA12430 for hpff-doc-out; Tue, 10 Sep 1996 09:52:55 -0500 (CDT)
Received: from mail12.digital.com (mail12.digital.com [192.208.46.20]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id JAA12425; Tue, 10 Sep 1996 09:52:48 -0500 (CDT)
Received: from mpsg.hpc.pko.dec.com by mail12.digital.com (8.7.5/UNX 1.2/1.0/WV)
	id KAA02415; Tue, 10 Sep 1996 10:41:14 -0400 (EDT)
Received: by mpsg.hpc.pko.dec.com; id AA14296; Tue, 10 Sep 1996 10:43:21 -0400
From: offner@hpc.pko.dec.com (Carl Offner)
Received: by hardy.hpc.pko.dec.com; (5.65v3.2/1.1.8.2/01Nov94-0839AM)
	id AA14985; Tue, 10 Sep 1996 10:41:11 -0400
Date: Tue, 10 Sep 1996 10:41:11 -0400
Message-Id: <9609101441.AA14985@hardy.hpc.pko.dec.com>
To: chk@cs.rice.edu, meltzer@cray.com
Subject: hpff-doc: Comments on Portable/Efficient
Cc: hpff-doc@cs.rice.edu
Sender: owner-hpff-doc
Precedence: bulk


Andy and Chuck--

	Some comments on the latest version that I have of the "Coding
for Portable Performance in HPF" chapter:

		--Carl

-----------------------------------------------------------------------
**References to the "HPF Kernel" should be deleted, and the wording in
general changed from forms such as "X is included" to "X is
recommended".

**PURE Functions and Subroutines: Even better is using only pure
functions that do not access non-local data.

**Pointers:  in HPF 2.0, pointers can't point to mapped objects in any
case.

**INHERIT:  better to say that in general the INHERIT directive is NOT
recommended.  The only real exceptions are when writing library
routines, in which case the user would presumably write code that
dispatches on the distribution.  This is a real issue, because many
users will otherwise feel that INHERIT is an efficient construct and
avoid thinking about their distributions by simply slapping INHERIT
everywhere they can -- I have seen this happen several times.

**Subroutine interfaces:  In view of what is now in HPF 2.0, much of
this is unnecessary -- the restrictions are already part of the
language.  Also, assumed-shape arguments require an explicit interface
by the rules of Fortran anyway.

**DYNAMIC -- not in HPF 2.0


From owner-hpff-doc  Fri Sep 13 04:03:13 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id EAA22407 for hpff-doc-out; Fri, 13 Sep 1996 04:03:13 -0500 (CDT)
Received: from mail.gmd.de (mail.gmd.de [129.26.8.90]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id EAA22401 for <hpff-doc@cs.rice.edu>; Fri, 13 Sep 1996 04:03:01 -0500 (CDT)
Received: from fichte by mail.gmd.de with SMTP id AA30424
  (5.67b8/IDA-1.5 for <hpff-doc@cs.rice.edu>); Fri, 13 Sep 1996 11:02:41 +0200
Received: by fichte id AA16718
  (5.67b/IDA-1.5); Fri, 13 Sep 1996 11:02:12 +0200
Date: Fri, 13 Sep 1996 11:02:12 +0200
From: Thomas Brandes <Thomas.Brandes@gmd.de>
Message-Id: <199609130902.AA16718@fichte>
To: pm@icase.edu, offner@hpc.pko.dec.com, guy.steele@east.sun.com,
        hpff-doc@cs.rice.edu
Subject: hpff-doc: Comments on Mapping
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Md5: 6HD9c4iOkzKhjJH4Z7aa9w==
Sender: owner-hpff-doc
Precedence: bulk

In general: this chapter has been improved dramatically and is cleaner
            than ever before (thanks for that).

Some smaller comments:

a) syntax

The syntax of DISTRIBUTE allows:

!HPF$ DISTRIBUTE ONTO P :: D1, D2, D3
!HPF$ DISTRIBUTE ONTO P :: D1

but not (is this intentional?)

!HPF$ DISTRIBUTE D1 ONTO P

This could be changed by replacing

<dist-directive-stuff>  :: <dist-format-clause> [ <dist-onto-clause> ]

<dist-attribute-stuff>  :: <dist-directive-stuff> |
                           <dist-onto-clause>

with:

<dist-directive-stuff>  :: <dist-format-clause> [ <dist-onto-clause> ]
                           <dist-onto-clause>

<dist-attribute-stuff>  :: <dist-directive-stuff> 

b) DYNAMIC

On page 30, line 18 there is still DYNAMIC

c) Inconsistency between 2.4 and 7.3

   Section 2.3 (page 24, line 21-30) says

         REAL, DIMENSION(1000) :: LINUS, LUCY
   !HPF$ DISTRIBUTE (BLOCK) :: LINUS, LUCY

   LUCY and LINUS do not necessarily have the same mapping because they
   might, depending on the implementation, be distributed onto
   differently chosen processor arrangements; so corresponding elements
   of LUCY and LINUS might not reside on the same abstract processor.

   Section 7.3 (page 136) says:

   If an ONTO clause is not specified, a default arrangement is provided
   which is identical for arrays that have identical shapes, bounds, and
   identical explicit mapping directives.

   Which is correct?  The next point definitely argues for the second
   solution.

d) Distribution of global arrays

One point is not quite clear to me. The standard allows:

      SUBROUTINE SUB1 (...)
      COMMON /DATA/ A(N)
!HPF$ DISTRIBUTE A (BLOCK)
      ...
      END 

      SUBROUTINE SUB2 (...)
      COMMON /DATA/ A(N)
!HPF$ DISTRIBUTE A (BLOCK)
      ...
      END 

The distribution is incomplete and the compiler can choose the processor
array to which it will map the array A.

How can I be sure that for separate compilation the compiler will choose
the same distribution? The same problem might come up with distributions
of arrays in MODULEs.

The current definition implies that the processor arrays must be provided.

e) Underspecified mappings

There are now underspecified mappings, such as the following:

!HPF$ DISTRIBUTE ONTO P :: A
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A

This should be pointed out more clearly (probably in its own subsection).

Furthermore, it must be made clear that underspecified mappings are not
useful (or better, not allowed) for global arrays.

The semantics of underspecified mappings for dummy arrays are not quite clear.

!HPF$ DISTRIBUTE ONTO *P :: A
!HPF$ DISTRIBUTE ONTO P :: A
!HPF$ DISTRIBUTE * ONTO P :: A

Are these all the same?

f) New Proposal 

In a further e-mail I will put forward a proposal for Mapping and
Mapping in Subprogram Calls. Though it is not fully detailed, it might
contain some ideas that could be adopted.




From owner-hpff-doc  Fri Sep 13 04:06:16 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id EAA22440 for hpff-doc-out; Fri, 13 Sep 1996 04:06:16 -0500 (CDT)
Received: from mail.gmd.de (mail.gmd.de [129.26.8.90]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id EAA22435 for <hpff-doc@cs.rice.edu>; Fri, 13 Sep 1996 04:06:10 -0500 (CDT)
Received: from fichte by mail.gmd.de with SMTP id AA30541
  (5.67b8/IDA-1.5 for <hpff-doc@cs.rice.edu>); Fri, 13 Sep 1996 11:06:06 +0200
Received: by fichte id AA16725
  (5.67b/IDA-1.5); Fri, 13 Sep 1996 11:05:39 +0200
Date: Fri, 13 Sep 1996 11:05:39 +0200
From: Thomas Brandes <Thomas.Brandes@gmd.de>
Message-Id: <199609130905.AA16725@fichte>
To: pm@icase.edu, offner@hpc.pko.dec.com, guy.steele@east.sun.com,
        hpff-doc@cs.rice.edu
Subject: hpff-doc: Comments on Mapping in Subprogram Calls
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Md5: zzI7b3a6NVPW+GIPenRAIg==
Sender: owner-hpff-doc
Precedence: bulk


I am not happy with this chapter at all. Understanding all this
stuff was never really easy, but now the confusion is complete.

a) PRESCRIPTIVE / DESCRIPTIVE

   Section 3.1, p. 43,  says:

   ..., it is fair to say that the descriptive syntax is at this point
   maintained only for backwards compatibility.

   Does this imply that these are all the same (oh please let it be) as
   long as INTERFACE blocks are provided?

   My understanding says: yes

   !HPF$ DISTRIBUTE (BLOCK,BLOCK) ONTO P :: A
   !HPF$ DISTRIBUTE *(BLOCK,BLOCK) ONTO P :: A
   !HPF$ DISTRIBUTE (BLOCK,BLOCK) ONTO *P :: A
   !HPF$ DISTRIBUTE *(BLOCK,BLOCK) ONTO *P :: A

   I think that there was only a difference for the implementors. For
   prescriptive distributions the called routine will do the
   redistribution; for descriptive distributions the calling routine
   must do it (and therefore INTERFACE blocks were required).

   As INTERFACEs are now always required, the implementors can choose
   (if they have really supported both possibilities). But I think
   there should later be a compiler flag and no longer this confusing
   syntax.

b) Underspecified Distributions

      REAL D1(N,N), D2(N,N)
!HPF$ DISTRIBUTE D1(BLOCK,CYCLIC) ONTO P1
!HPF$ DISTRIBUTE D2(BLOCK,CYCLIC) ONTO P2
      call SUB (D1, N)
      call SUB (D2, N)
  
The subroutine SUB should be written in such a way that it can deal
with both mappings without redistributions. A first attempt would be:

      SUBROUTINE SUB (A, N)
      REAL A(N,N)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK)

The distribution of A is underspecified. The compiler might choose a
processor array or will take the default processor arrangement. 
This implies that there might be redistributions for
both arrays, D1 and D2. So the following one would be more correct:

      SUBROUTINE SUB (A, N)
      REAL A(N,N)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO *

Here, the compiler must generate code that can deal with every processor array.
No redistributions will be necessary.
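
To make the role of the processor arrangement concrete, here is a small
Python model (not HPF). It only illustrates that the same (BLOCK,CYCLIC)
format places elements differently on differently shaped arrangements.
The shapes chosen for P1 and P2, the value of N, and the column-major
numbering of the arrangement are all assumptions made for this sketch.

```python
from math import ceil

def block_coord(i, n, p):
    """0-based processor coordinate owning 1-based index i under BLOCK."""
    return (i - 1) // ceil(n / p)

def cyclic_coord(j, p):
    """0-based processor coordinate owning 1-based index j under CYCLIC."""
    return (j - 1) % p

def owner_rank(i, j, n, shape):
    """Linear rank owning element (i, j) of an n-by-n array distributed
    (BLOCK, CYCLIC) onto an arrangement of the given shape.
    Column-major numbering of the arrangement is an assumption."""
    rows, cols = shape
    return block_coord(i, n, rows) + rows * cyclic_coord(j, cols)

N = 8
P1_SHAPE = (2, 4)  # hypothetical shape for P1
P2_SHAPE = (4, 2)  # hypothetical shape for P2

# Count the elements whose owner differs between the two arrangements.
moved = sum(
    owner_rank(i, j, N, P1_SHAPE) != owner_rank(i, j, N, P2_SHAPE)
    for i in range(1, N + 1)
    for j in range(1, N + 1)
)
print(f"{moved} of {N * N} elements live on different processors")
```

A subroutine compiled for one fixed arrangement would therefore have to
move most of the data of at least one of the two actuals, which is the
cost the ONTO * form is meant to avoid.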

I think that this should be pointed out more clearly. Or has underspecified
mapping another meaning for dummy arguments?

Same issue (my understanding is that the following directives differ:
in the first two cases the compiler can choose, and in the other two
cases the compiler has to accept everything):

   !HPF$ DISTRIBUTE ONTO P
   !HPF$ DISTRIBUTE ONTO *P
   !HPF$ DISTRIBUTE * ONTO P
   !HPF$ DISTRIBUTE * ONTO *P

Who can explain this stuff to a user?

c) Confusion about transcriptive distribution and transcriptive alignment

   !HPF$ TEMPLATE T(N)
   !HPF$ DISTRIBUTE T(CYCLIC(M))
         REAL A(N)
   !HPF$ ALIGN A(I) WITH T(a*I+b)
         CALL SUB (A)

         SUBROUTINE SUB (X)
         REAL X(:)
   !HPF$ DISTRIBUTE X * ONTO *

   The subroutine specifies a transcriptive distribution. The actual argument
   is aligned (and not distributed).

   Do I have to expect a redistribution?

   Page 45, line 33 says: The mapping should not be changed!

   This would imply that the compiler has to treat it like 'INHERIT X'.

d) aligned actual array, distributed dummy (compatibility of mappings)

   Writing subroutines with DISTRIBUTE directives for the dummies seems
   to be more general, as the code will be more flexible.

   But what happens if the actual array is aligned?

   Alignment allows permutation, embedding, and replication.

   1. permuted dimensions
   
   !HPF$ TEMPLATE T(N,N)
   !HPF$ DISTRIBUTE T(BLOCK,CYCLIC)
         REAL A(N,N)
   !HPF$ ALIGN A(I,J) WITH T(J,I)
         call SUB (A,N)

         subroutine SUB (A, N)
         REAL A(N,N)
   !HPF$ DISTRIBUTE A(CYCLIC,BLOCK) ONTO *

   2. replicated dimensions:

   !HPF$ TEMPLATE T(N,N)
   !HPF$ DISTRIBUTE T(BLOCK,CYCLIC)
         REAL A(N)
   !HPF$ ALIGN A(I) WITH T(I,*)
         call SUB (A,N)

         subroutine SUB (A, N)
         REAL A(N)
   !HPF$ DISTRIBUTE A(CYCLIC) ONTO *

   3. embedding

   !HPF$ TEMPLATE T(N,N)
   !HPF$ DISTRIBUTE T(BLOCK,CYCLIC)
         REAL A(N)
   !HPF$ ALIGN A(I) WITH T(I,2)
         call SUB (A,N)

         subroutine SUB (A, N)
         REAL A(N)
   !HPF$ DISTRIBUTE A(CYCLIC) ONTO *

   Must I assume redistributions in every case? What would a good
   compiler do?

   In other words: there should be more clarity about compatible
   mappings.

e) Proposal

   In a further e-mail I will put forward a proposal for Mapping and
   Mapping in Subprogram Calls. Though it is not fully detailed, it might
   contain some ideas which could be taken over.


From owner-hpff-doc  Fri Sep 13 04:14:55 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id EAA22503 for hpff-doc-out; Fri, 13 Sep 1996 04:14:55 -0500 (CDT)
Received: from mail.gmd.de (mail.gmd.de [129.26.8.90]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id EAA22498 for <hpff-doc@cs.rice.edu>; Fri, 13 Sep 1996 04:14:49 -0500 (CDT)
Received: from fichte by mail.gmd.de with SMTP id AA30184
  (5.67b8/IDA-1.5 for <hpff-doc@cs.rice.edu>); Fri, 13 Sep 1996 11:14:37 +0200
Received: by fichte id AA16740
  (5.67b/IDA-1.5); Fri, 13 Sep 1996 11:14:09 +0200
Date: Fri, 13 Sep 1996 11:14:09 +0200
From: Thomas Brandes <Thomas.Brandes@gmd.de>
Message-Id: <199609130914.AA16740@fichte>
To: m@icase.edu, offner@hpc.pko.dec.com, guy.steele@east.sun.com,
        hpff-doc@cs.rice.edu
Subject: hpff-doc: Comments on Mapping, Mapping in Subprogram Calls
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Md5: dt9skpQPAUWCdLOt5sW/mQ==
Sender: owner-hpff-doc
Precedence: bulk

Proposal for Mapping / Mapping in Subprogram Calls
==================================================

A) DESCRIPTIVE / PRESCRIPTIVE / TRANSCRIPTIVE

   Stop distinguishing between descriptive, prescriptive and
   transcriptive mappings!! I doubt that any user can really understand
   the distinction. I have spent a lot of time trying to understand it.
   Sometimes I think I have, but some days later I am confused again.

   Much of the confusion is due to the fact that mappings can be
   underspecified.

   Furthermore, the confusion becomes more dramatic with the newly
   proposed RANGE directive, which does not conform to the descriptive
   directives.

   Please do not keep DESCRIPTIVE/PRESCRIPTIVE for the sake of history!
   It is a bad one!

B) FULLY SPECIFIED / UNDERSPECIFIED MAPPINGS

   Instead, distinguish between fully specified and underspecified
   mappings. These can be defined as follows:

   - an explicit distribution with an ONTO clause (where the processor
     array is specified) is a fully specified mapping

   - an array explicitly aligned to a distributee that has a fully
     specified mapping also has a fully specified mapping

   Examples of underspecified mappings:

   !HPF$ DISTRIBUTE A(BLOCK,*) ONTO *
   !HPF$ DISTRIBUTE A(ANY,ANY) ONTO P
   !HPF$ DISTRIBUTE A ONTO P
   !HPF$ DISTRIBUTE A ONTO *

   A fully specified mapping is given if the compiler has no choice
   about the actual mapping.

   Critical Point (open for discussion, but my view is clear):

   A missing ONTO clause implies a default arrangement. But this
   arrangement is (or is it not?) identical for distributees that have
   identical shapes and identical explicit mappings.

   !HPF$ DISTRIBUTE A(BLOCK,BLOCK)

   As there is obviously no choice for the compiler, this is also
   a fully specified mapping. This is especially useful for obtaining
   fully specified mappings without defining processor arrays (which
   cannot be passed to subroutines).

   Attention: the following directives are different:

   !HPF$ DISTRIBUTE A(BLOCK,BLOCK)
   !HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO *

C) FULLY SPECIFIED MAPPINGS

   A fully specified mapping is absolutely necessary for global arrays
   (mapped arrays in common blocks and in modules). Otherwise the compiler
   might choose different distributions in different compilation units.

      SUBROUTINE SUB1 (...)
      COMMON /DATA/ A(1000,1000)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK)

      SUBROUTINE SUB2 (...)
      COMMON /DATA/ A(1000,1000)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK)

   The following code would be non-conformant:

      SUBROUTINE SUB1 (...)
      COMMON /DATA/ A(1000,1000)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO *

      SUBROUTINE SUB2 (...)
      COMMON /DATA/ A(1000,1000)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO *

D) SEMANTICS of UNDERSPECIFIED MAPPINGS

   For local arrays with underspecified mappings the compiler may choose
   the final mapping, which must be compatible with the underspecified
   mapping.

   For dummy arrays the compiler will generate code that can deal with
   all mappings of the actuals that satisfy the underspecified mapping.
   But the actual must be compatible with the dummy mapping (otherwise
   the data is redistributed).

   For pointers with underspecified mappings the compiler can choose
   the final mapping in an ALLOCATE statement. A pointer can be associated
   with a target if the mapping of the target is compatible.


   Example: !HPF$ DISTRIBUTE C (CYCLIC,BLOCK)

   This also works fine when INTERFACE blocks are provided for remappings
   at subroutine boundaries, where the compiler might then choose the
   less expensive redistribution.

F) Redistributions at Subroutine Boundaries

   A redistribution is required if the actual argument is not compatible
   with the dummy argument. In this case, an explicit INTERFACE must
   exist.

E) Extensions for underspecified mappings

   For dummy arrays there should obviously be some more general ways
   to underspecify the distribution.

   I propose to extend the distribution formats in such a way that they
   conform to the newly proposed RANGE directive.

   dummy_dist_format :: BLOCK [ (int-expr) ]
                        CYCLIC [ (int-expr) ]
                        GEN_BLOCK (int-array)
                        INDIRECT (int-array)
                        *
                        BLOCK ()
                        CYCLIC ()
                        GEN_BLOCK
                        INDIRECT 
                        ALL

   The new forms are used to underspecify the actual distribution.

   For alignments that have strides and offsets another syntax is
   introduced:

   extended_dummy_dist_format : dummy_dist_format
                                ':' dummy_dist_format
                                
F) Compatibility of Mappings

   a) compatibility with processor arrays

   !HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO P

   If the processor array is fully specified, it implies that the first
   dimension of A is distributed along the first dimension of processor
   array P and the second one along the second dimension of P.

   !HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO *P

   This implies that array A is mapped to processor array P (only
   these processors own values of A). But in fact, it can also have
   embedded, replicated or permuted dimensions.

   !HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO *

   A is mapped to some processor array. Embedding, replication
   or permutation of dimensions can be possible.

   !HPF$ TEMPLATE T(N,N)
   !HPF$ DISTRIBUTE T(BLOCK,BLOCK) ONTO P
         REAL A(N,N), B(N,N)
   !HPF$ ALIGN A(I,J) WITH T(J,I)
   !HPF$ ALIGN B(I,J) WITH T(I,J)

   The mappings of A and B are both compatible with

   !HPF$ DISTRIBUTE (BLOCK,BLOCK) ONTO *
   !HPF$ DISTRIBUTE (BLOCK,BLOCK) ONTO *P

   But only the mapping of B is compatible with

   !HPF$ DISTRIBUTE (BLOCK,BLOCK) ONTO P

   b) compatibility of distributions

   BLOCK is compatible with BLOCK, BLOCK(), CYCLIC(), ALL
   BLOCK(m) is compatible with BLOCK(m), BLOCK(), GEN_BLOCK, ALL
   CYCLIC is compatible with CYCLIC, CYCLIC(), ALL
   CYCLIC(m) is compatible with CYCLIC(m), CYCLIC(), ALL
   GEN_BLOCK(int_array) is compatible with GEN_BLOCK, ALL
   INDIRECT(int_array) is compatible with INDIRECT, ALL

   * is compatible with everything under the restriction that
     the processor array is not fully specified.

         REAL A(N,N), B(N,N)
   !HPF$ DISTRIBUTE A(*,BLOCK) ONTO P
   !HPF$ DISTRIBUTE B(BLOCK,*) ONTO P

   A and B are both compatible with

   !HPF$ DISTRIBUTE (BLOCK,BLOCK) ONTO *P

   Note: BLOCK and BLOCK(m) are also compatible with GEN_BLOCK which
         is currently not the case for the RANGE directive (but it is
         trivial for the compiler).
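
   The list in b) can be read as a lookup table. The Python sketch below
   transcribes it literally (so plain BLOCK is not yet listed against
   GEN_BLOCK, as in the list). The function names are hypothetical, the
   string keys ignore the requirement that a specific m must match, and
   the rule for * is read as applying to a dimension of the actual.

```python
# Per-dimension compatibility, transcribed from the list in b).
# Key: format of the actual; value: dummy formats it is listed against.
COMPATIBLE = {
    "BLOCK": {"BLOCK", "BLOCK()", "CYCLIC()", "ALL"},
    "BLOCK(m)": {"BLOCK(m)", "BLOCK()", "GEN_BLOCK", "ALL"},
    "CYCLIC": {"CYCLIC", "CYCLIC()", "ALL"},
    "CYCLIC(m)": {"CYCLIC(m)", "CYCLIC()", "ALL"},
    "GEN_BLOCK(int_array)": {"GEN_BLOCK", "ALL"},
    "INDIRECT(int_array)": {"INDIRECT", "ALL"},
}

def dim_compatible(actual, dummy, processors_fully_specified):
    """Check one dimension of an actual against one dimension of a dummy."""
    if actual == "*":
        # "*" matches anything unless the dummy names its processor
        # array without the leading asterisk (fully specified).
        return not processors_fully_specified
    return dummy in COMPATIBLE.get(actual, set())

def mapping_compatible(actual_dims, dummy_dims, processors_fully_specified):
    """Check every dimension; ranks must agree."""
    return len(actual_dims) == len(dummy_dims) and all(
        dim_compatible(a, d, processors_fully_specified)
        for a, d in zip(actual_dims, dummy_dims)
    )

# The example from the text: A(*,BLOCK) and B(BLOCK,*) against
# a dummy declared (BLOCK,BLOCK) ONTO *P.
dummy = ["BLOCK", "BLOCK"]
a_ok = mapping_compatible(["*", "BLOCK"], dummy, processors_fully_specified=False)
b_ok = mapping_compatible(["BLOCK", "*"], dummy, processors_fully_specified=False)
print(a_ok, b_ok)
```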

   c) Aligned dimensions

   !HPF$ TEMPLATE T(N)
   !HPF$ DISTRIBUTE T(BLOCK)
         REAL A(M)
   !HPF$ ALIGN A(I) with T(a*I+b)

   The mapping of A is not compatible with

   !HPF$ DISTRIBUTE (BLOCK) ONTO *

   as the dimension of A is not really BLOCK distributed.

   Therefore we need a way to specify this, as the compiler has
   to generate more general code.

   !HPF$ DISTRIBUTE A(:BLOCK)

   In general, :<dummy_dist_format> tells the compiler that the actual
   dimension is aligned in some way to a template dimension that has 
   this distribution.

   Note: compiler might take advantage of the fact that the stride will 
         be 1 (special notation ?)

   This notation can also be used if array sections are passed to
   subroutines.

         LOGICAL FRUG(128)
   !HPF$ PROCESSORS DANCE_FLOOR(16)
   !HPF$ DISTRIBUTE (BLOCK) ONTO DANCE_FLOOR :: FRUG
         CALL TERPSICHORE (FRUG(1:40:3))

         SUBROUTINE TERPSICHORE (FOXTROT)
         LOGICAL FOXTROT(:)
   !HPF$ DISTRIBUTE (:BLOCK) ONTO *

   The distribution of the actual array section is compatible with
   the underspecified distribution of the dummy.
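
   Why FOXTROT is not plainly BLOCK distributed can be checked by hand.
   The Python sketch below (a model, not HPF) computes which processor
   of DANCE_FLOOR owns each element of FRUG(1:40:3). Numbering the
   processors from 0 is an assumption made for brevity.

```python
from math import ceil

N, P = 128, 16
BLOCK_SIZE = ceil(N / P)  # 8 elements of FRUG per processor

def frug_owner(k):
    """0-based processor owning FRUG(k) under BLOCK onto DANCE_FLOOR."""
    return (k - 1) // BLOCK_SIZE

# FOXTROT(j) is associated with FRUG(1 + 3*(j-1)) for j = 1..14.
section = list(range(1, 41, 3))
foxtrot_owners = [frug_owner(k) for k in section]

# For comparison: a plain BLOCK distribution of a 14-element array
# over the same 16 processors would put one element on each processor.
m = len(section)
plain_block_owners = [(j - 1) // ceil(m / P) for j in range(1, m + 1)]

print(foxtrot_owners)
print(plain_block_owners)
```

   The section touches only the first five processors and the number of
   elements per processor varies, so the dummy has to be described as
   aligned to a BLOCK-distributed dimension rather than as BLOCK itself.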


   d) Permutations, Embeddings, Replications

   The following code will not require any redistribution:

   !HPF$ TEMPLATE T(N,N)
   !HPF$ DISTRIBUTE T(BLOCK,CYCLIC)
         REAL A(N,N)
   !HPF$ ALIGN A(I,J) WITH T(J,I)
         call SUB (A,N)

         subroutine SUB (A, N)
         REAL A(N,N)
   !HPF$ DISTRIBUTE A(CYCLIC,BLOCK) ONTO *
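
   The claim can be checked with a small Python model (not HPF). It
   assumes a hypothetical 2 x 3 arrangement for T and N = 6, and it
   assumes that ONTO * lets the dummy use the same processors with the
   two dimensions of the arrangement swapped.

```python
from math import ceil

def block(i, n, p):
    """0-based coordinate owning 1-based index i under BLOCK."""
    return (i - 1) // ceil(n / p)

def cyclic(i, p):
    """0-based coordinate owning 1-based index i under CYCLIC."""
    return (i - 1) % p

N = 6
T_SHAPE = (2, 3)  # hypothetical arrangement used for template T

def template_owner(r, c):
    """Owner coordinates of T(r, c) under DISTRIBUTE T(BLOCK,CYCLIC)."""
    return (block(r, N, T_SHAPE[0]), cyclic(c, T_SHAPE[1]))

def actual_owner(i, j):
    """Owner of A(i, j) in the caller: ALIGN A(I,J) WITH T(J,I)."""
    return template_owner(j, i)

def dummy_owner(i, j):
    """Owner of A(i, j) in SUB: (CYCLIC,BLOCK) onto the swapped shape."""
    q_shape = (T_SHAPE[1], T_SHAPE[0])
    return (cyclic(i, q_shape[0]), block(j, N, q_shape[1]))

# Swapping the dummy's coordinates back must give the caller's owner.
no_data_motion = all(
    actual_owner(i, j) == tuple(reversed(dummy_owner(i, j)))
    for i in range(1, N + 1)
    for j in range(1, N + 1)
)
print(no_data_motion)
```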

   The same holds for replicated dimensions:

   !HPF$ TEMPLATE T(N,N)
   !HPF$ DISTRIBUTE T(BLOCK,CYCLIC)
         REAL A(N)
   !HPF$ ALIGN A(I) WITH T(I,*)
         call SUB (A,N)

         subroutine SUB (A, N)
         REAL A(N)
   !HPF$ DISTRIBUTE A(CYCLIC) ONTO *

   Embedded arrays can also be compatible with underspecified distributions:

   !HPF$ TEMPLATE T(N,N)
   !HPF$ DISTRIBUTE T(BLOCK,CYCLIC)
         REAL A(N)
   !HPF$ ALIGN A(I) WITH T(I,2)
         call SUB (A,N)

         subroutine SUB (A, N)
         REAL A(N)
   !HPF$ DISTRIBUTE A(CYCLIC)

   e) Compatibility of aligned arrays

         REAL B(N,N)
         REAL A(N,N)
   !HPF$ DISTRIBUTE A(BLOCK,BLOCK)
   !HPF$ ALIGN B(I,J) WITH A(I,J)
         CALL SUB (A,B)

         SUBROUTINE SUB (X,Y)
         REAL X(:,:), Y(:,:)
   !HPF$ DISTRIBUTE Y(BLOCK,BLOCK)
   !HPF$ ALIGN Y(I,J) WITH X(I,J)

   The aligned array B is compatible with the distribution of the dummy
   argument Y. The distribution of A is also compatible with the
   alignment of Y (as in this case the alignment can be inverted).

   Advice to users: A compiler might be unable to verify compatibility
   and might generate code for redistribution. But at runtime the
   compatibility guarantees that only a local copy is required.

G) INHERIT + RANGE DIRECTIVE

   !HPF$ INHERIT X
 
   is equivalent to !HPF$ DISTRIBUTE X (:ALL, ..., :ALL) ONTO *

   !HPF$ RANGE <dummy_distribution_list>
 
   conforms to the current solution but allows several possibilities
   instead of one single distribution directive.

H) What should be in the approved extensions ?

   Probably the new possibilities for underspecified distributions
   should be put into the approved extensions.

I) Conformance to the current standard

   !HPF$ INHERIT is equivalent to

   !HPF$ DISTRIBUTE (:ALL,...,:ALL) ONTO *

   The syntax of prescriptive directives could still be used but there
   is no semantic difference.

   !HPF$ DISTRIBUTE *(BLOCK,BLOCK) ONTO P
   !HPF$ DISTRIBUTE (BLOCK,BLOCK) ONTO P

Summary of Advantages:
======================

a) the concept is more clean and will hopefully cause less confusion
   for the user.

b) much more flexibility in writing 'efficient' code that can deal
   with different distributions, e.g.:

         SUBROUTINE SUB (A)
         REAL A(:,:)
   !HPF$ DISTRIBUTE A (:BLOCK,:BLOCK) ONTO *
         ...
         END SUBROUTINE

c) compatibility with the RANGE directive of the approved extensions

d) implementation in a compiler is not too complicated (I hope!)

From owner-hpff-doc  Fri Sep 13 04:17:19 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id EAA22617 for hpff-doc-out; Fri, 13 Sep 1996 04:17:19 -0500 (CDT)
Received: from mail.gmd.de (mail.gmd.de [129.26.8.90]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id EAA22612; Fri, 13 Sep 1996 04:17:15 -0500 (CDT)
Received: from fichte by mail.gmd.de with SMTP id AA31425
  (5.67b8/IDA-1.5); Fri, 13 Sep 1996 11:17:12 +0200
Received: by fichte id AA16746
  (5.67b/IDA-1.5); Fri, 13 Sep 1996 11:16:44 +0200
Date: Fri, 13 Sep 1996 11:16:44 +0200
From: Thomas Brandes <Thomas.Brandes@gmd.de>
Message-Id: <199609130916.AA16746@fichte>
To: meltzer@cray.com, chk@cs.rice.edu, hpff-doc@cs.rice.edu
Subject: hpff-doc: Comments on Portable/Efficient Constructs
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Md5: QVnlbLsQPw8GrD3qWHIzaA==
Sender: owner-hpff-doc
Precedence: bulk

Comments on Portable/Efficient Constructs

a) Inconsistency between 2.4 and 7.3

   Section 2.3 (page 24, line 21-30) says

         REAL, DIMENSION(1000) :: LINUS, LUCY
   !HPF$ DISTRIBUTE (BLOCK) :: LINUS, LUCY

   LUCY and LINUS do not necessarily have the same mapping because they
   might, depending on the implementation, be distributed onto
   differently chosen processor arrangements; so corresponding elements
   of LUCY and LINUS might not reside on the same abstract processor.

   Section 7.3 (page 136) says:

   If an ONTO clause is not specified, a default arrangement is provided
   which is identical for arrays that have identical shapes, bounds, and
   identical explicit mapping directives.

   What is correct? I would vote for the last solution in any case.

b) 7.8 INDEPENDENT 

   How to distinguish the good loops from the bad ones?

   Good loops are:

   - scalar variables defined in the loop are either new variables or 
     reduction variables (this is probably necessary to be INDEPENDENT
     at all). The same should be also true for serial arrays, sequential
     and replicated data (because the compiler has to guarantee consistency
     at the end of the loop).

   - all elements of mapped array variables that are defined within
     the loop are aligned with each other (this implies that the compiler
     will choose a good home for every iteration). 

   !HPF$ INDEPENDENT
         DO i = 1, n
            x(i) = a(i) * a(i)
            d(i) = x(i) + c(i)
         END DO

   - the same array variable should not be defined and used in the same
     loop with different indexes

   !HPF$ INDEPENDENT 
         DO I = 1, N
            X(I) = ... 
            Y(I) = f(X(I+k))
         END DO 

     Here the compiler might have trouble extracting the communication.
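
   The two loops above can be compared with a small Python check (a
   model, not HPF). It reads INDEPENDENT as the requirement that no
   iteration writes a location that another iteration reads or writes.
   The helper names and the loop bounds are hypothetical.

```python
def is_independent(iterations, writes, reads):
    """True if no location written by one iteration is read or
    written by a different iteration."""
    writer = {}
    for it in iterations:
        for loc in writes(it):
            if loc in writer and writer[loc] != it:
                return False  # two iterations write the same location
            writer[loc] = it
    for it in iterations:
        for loc in reads(it):
            if loc in writer and writer[loc] != it:
                return False  # read of a location another iteration writes
    return True

n = 8
iters = range(1, n + 1)

# First loop: x(i) = a(i)*a(i);  d(i) = x(i) + c(i)
loop1 = is_independent(
    iters,
    writes=lambda i: [("x", i), ("d", i)],
    reads=lambda i: [("a", i), ("x", i), ("c", i)],
)

def second_loop(k):
    # Second loop: X(I) = ...;  Y(I) = f(X(I+k))
    return is_independent(
        iters,
        writes=lambda i: [("X", i), ("Y", i)],
        reads=lambda i: [("X", i + k)],
    )

print(loop1, second_loop(0), second_loop(1), second_loop(n))
```

   Under this reading the second loop qualifies only when X(I+k) never
   names an element written by a different iteration (k = 0, or k at
   least as large as the iteration range), so the different indexes are
   a correctness question before they are a communication question.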

c) 7.14 Subroutine Interfaces

   Do you mean that this is not allowed?

         REAL A(N)
   !HPF$ DISTRIBUTE A(CYCLIC(M))
         CALL SUB (A(2:N:2))

         SUBROUTINE SUB (X)
         REAL X(:)
   !HPF$ DISTRIBUTE X * ONTO *

   Passing array sections should probably always be avoided, as they
   might always require redistributions!


From owner-hpff-doc  Fri Sep 13 05:22:04 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id FAA24646 for hpff-doc-out; Fri, 13 Sep 1996 05:22:04 -0500 (CDT)
Received: from mail.gmd.de (mail.gmd.de [129.26.8.90]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id FAA24641 for <hpff-doc@cs.rice.edu>; Fri, 13 Sep 1996 05:22:00 -0500 (CDT)
Received: from fichte by mail.gmd.de with SMTP id AA04292
  (5.67b8/IDA-1.5 for <hpff-doc@cs.rice.edu>); Fri, 13 Sep 1996 12:21:57 +0200
Received: by fichte id AA16811
  (5.67b/IDA-1.5); Fri, 13 Sep 1996 12:21:28 +0200
Date: Fri, 13 Sep 1996 12:21:28 +0200
From: Thomas Brandes <Thomas.Brandes@gmd.de>
Message-Id: <199609131021.AA16811@fichte>
To: pm@icase.edu, offner@hpc.pko.dec.com, hpff-doc@cs.rice.edu
Subject: hpff-doc: Comments on Extended Mapping
Cc: brandes@gmd.de
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Md5: AiTDCdl4Lkh70b4viKAZcA==
Sender: owner-hpff-doc
Precedence: bulk

1) RANGE DIRECTIVE (8.11)
   
Proposal: BLOCK, BLOCK(m) are also compatible with GEN_BLOCK

I believe that this will not make any difference for the compiler.
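
A quick way to see why: every BLOCK or BLOCK(m) distribution can be
written as a list of block sizes, which is what GEN_BLOCK takes. The
Python sketch below (a model, not HPF) builds that list and checks that
both forms assign the same owner to every element. The function names
are hypothetical and processors are numbered from 0.

```python
from math import ceil

def block_as_gen_block(n, p, m=None):
    """Block-size list equivalent to BLOCK (m is None) or BLOCK(m)
    for n elements on p processors."""
    b = ceil(n / p) if m is None else m
    sizes, remaining = [], n
    for _ in range(p):
        take = min(b, remaining)
        sizes.append(take)
        remaining -= take
    if remaining:
        raise ValueError("BLOCK(m) needs m * p >= n")
    return sizes

def block_owner(i, n, p, m=None):
    """0-based owner of 1-based element i under BLOCK or BLOCK(m)."""
    b = ceil(n / p) if m is None else m
    return (i - 1) // b

def gen_block_owner(i, sizes):
    """0-based owner of 1-based element i under GEN_BLOCK(sizes)."""
    upper = 0
    for proc, size in enumerate(sizes):
        upper += size
        if i <= upper:
            return proc
    raise IndexError("element index beyond the distributed extent")

n, p = 10, 4
sizes_default = block_as_gen_block(n, p)     # BLOCK
sizes_m4 = block_as_gen_block(n, p, m=4)     # BLOCK(4)
same_default = all(
    block_owner(i, n, p) == gen_block_owner(i, sizes_default)
    for i in range(1, n + 1)
)
same_m4 = all(
    block_owner(i, n, p, 4) == gen_block_owner(i, sizes_m4)
    for i in range(1, n + 1)
)
print(sizes_default, sizes_m4, same_default, same_m4)
```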

2) Shadow Width Declarations (8.12)

Sorry, I am absolutely not happy with it.

a) What about aligned arrays?

   Chapter 2 pushes the user to make more use of the ALIGN directive.
   What is the shadow width of an aligned array?

   If shadow edges are inherited, this will add shadow edges for a lot
   of arrays where this is not really required.

b) What about dummy arrays ?

   Can we specify shadow edges for dummy arrays ?

   This can be useful in certain cases.

c) Shadow edges for serial arrays ?

         REAL A(N)
   !HPF$ DISTRIBUTE A(*)
         A = CSHIFT (A,1) + CSHIFT(A,-1)

   A shadow edge is also useful for the array A as the code generation
   might be less complex.

   A(0) = A(N)
   A(N+1) = A(1)
   FORALL (I=1:N) A(I) = A(I-1) + A(I+1)
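
   That the shadow-edge form computes the same values as the CSHIFT form
   can be checked with a short Python model (not HPF). Lists stand in
   for the serial array, and cshift follows the Fortran convention that
   element i of CSHIFT(A,s) is A(i+s), taken circularly.

```python
def cshift(a, shift):
    """Circular shift in the Fortran sense: result[i] = a[(i + shift) mod n]."""
    n = len(a)
    return [a[(i + shift) % n] for i in range(n)]

def stencil_with_cshift(a):
    """A = CSHIFT(A,1) + CSHIFT(A,-1)"""
    return [hi + lo for hi, lo in zip(cshift(a, 1), cshift(a, -1))]

def stencil_with_shadow(a):
    """Fill A(0) = A(N) and A(N+1) = A(1), then add the two neighbours."""
    n = len(a)
    padded = [a[-1]] + list(a) + [a[0]]  # padded[k] plays the role of A(k)
    return [padded[i - 1] + padded[i + 1] for i in range(1, n + 1)]

a = [1, 2, 3, 4, 5]
print(stencil_with_cshift(a), stencil_with_shadow(a))
```

   Both functions build the result from the old values of A, matching
   the FORALL semantics in which the right-hand side is evaluated before
   any assignment takes place.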

Proposal: insert a directive that is related to arrays instead of 
          distributions.

   !HPF$ SHADOW <shadow_decl_list>

   <shadow_decl> :: array_name '(' shadow_spec_list ')'

   <shadow_spec> :: [int_expr]
                    int_expr : int_expr
                    LOW_SHADOW = int-expr [: HIGH_SHADOW = int-expr]
                    HIGH_SHADOW = int-expr [: LOW_SHADOW = int-expr]

   Example:

   !HPF$ SHADOW A(3,1:2,HIGH_SHADOW=3)
   !HPF$ SHADOW B(,LOW_SHADOW=1)


      _______    _   _       _____   Thomas Brandes
     / _____/   / | / |     / ___ |  Forschungszentrum Informationstechnik
    / / ____   /  |/  |    / /  / /  Schloss Birlinghoven
   / / /_  /  / /   / |   / /  / /   D-53754 Sankt Augustin
  / /___/ /  / /|  /| |  / /__/ /    Tel: +49-2241-14-2492   Fax: 2181
  \______/  /_/ |_/ |_| /______/     Email: brandes@gmd.de
                                     http: //www.gmd.de/SCAI/people/brandes.html

From owner-hpff-doc  Fri Sep 13 05:24:48 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id FAA24670 for hpff-doc-out; Fri, 13 Sep 1996 05:24:48 -0500 (CDT)
Received: from mail.gmd.de (mail.gmd.de [129.26.8.90]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id FAA24665 for <hpff-doc@cs.rice.edu>; Fri, 13 Sep 1996 05:24:42 -0500 (CDT)
Received: from fichte by mail.gmd.de with SMTP id AA05255
  (5.67b8/IDA-1.5 for <hpff-doc@cs.rice.edu>); Fri, 13 Sep 1996 12:24:39 +0200
Received: by fichte id AA16817
  (5.67b/IDA-1.5); Fri, 13 Sep 1996 12:24:10 +0200
Date: Fri, 13 Sep 1996 12:24:10 +0200
From: Thomas Brandes <Thomas.Brandes@gmd.de>
Message-Id: <199609131024.AA16817@fichte>
To: baden@cs.ucsd.edu, hpff-doc@cs.rice.edu
Subject: hpff-doc: Comments on SPMD-to-HPF
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Md5: 3dVmYKaToQRIhbPZQ6SBDg==
Sender: owner-hpff-doc
Precedence: bulk

Comments on SPMD-to-HPF
=======================

Though I believe that the topic is one of the most exciting features, the
current proposal is too inconsistent and still in a very early draft stage.

Here are some detailed comments:

A) currently I would not like to see the strong connections to MPI.

For the example in F.4.5 I see a lot of problems:

!   Set up the process groups and communicators, one for each block

    call MPI_COMM_SPLIT(MPI_COMM_WORLD,ZERO_TO_KM1,DUMMY,COMM_A,IERROR)
    call init_HPF(COMM_A)

    call MPI_COMM_SPLIT(MPI_COMM_WORLD,K_TO_PM1,DUMMY,COMM_B,IERROR)
    call init_HPF(COMM_B)

    if (isMember(COMM_A))
!       TO_MAPPED_ARRAY( ) converts the global array reference to a
!       MAPPED_ARRAY.  k>1 arguments may correspond to TO_MAPPED_ARRAY(A)
        call alloc_mapped_arrayREAL(TO_MAPPED_ARRAY(A), 2, A_EXT, DECOMP)
    if (isMember(COMM_B))
        call alloc_mapped_arrayREAL(TO_MAPPED_ARRAY(B), 2, B_EXT, DECOMP)

How does the realization of alloc_mapped_array know on which processor
array A and B will be allocated? The communicator is not given as an
argument!

So: just propose an HPF_INIT without any arguments and separate this
    initialization completely from the use of any processor arrays (in fact
    the proposed mechanisms are not quite understandable, see also below).

B) The use of macros and subroutines should be pointed out more clearly

  Macros: MAPPED_ARRAY(A)
          TO_MAPPED_ARRAY(A)

          BLOCK
          CYCLIC
          SERIAL
          CYCLIC(n)    ! attention with parameter n

  Attention: names, especially SERIAL, should be changed to e.g.
             BLOCK_DISTRIBUTION, CYCLIC_DISTRIBUTION, SERIAL_DISTRIBUTION
             
C) Subroutines (be more consistent between TTT and _TTT):

          HPF_INIT   (attention: later INIT_HPF is used)
          alloc_mapped_array_TTT
          copy_mapped_array_TTT
          copy_mapped_array_sec_TTT

Remark: you propose an optional argument ALIGN for alloc_mapped_array_TTT.
        This restricts the use to Fortran 90. Better: use two different
        routines.

   alloc_aligned_array_TTT (A, NDIM, ALIGN)

   This might also be useful as the EXTENTS must be the same in any case.

D) How does the mechanism for calling HPF routines work?
          
!       These calls to HPF execute in different processor groups; their
!       respective arguments were allocated with different communicators

        if (isMember(COMM_A))
            residA = smooth(A)

        if (isMember(COMM_B))
            residB = smooth(B)

Obviously there is never a definition of A and B. How can they be passed
to the routine smooth? Must 'smooth' be defined as an HPF routine?

E) What about using existing functionality ?

The proposal says that 

   GLOBAL_ALIGNMENT, ...., MY_PROCESSOR are included.

But: these routines rely on Fortran 90. What can be used in other
     languages like Fortran 77, C, C++ ? 

Idea: There are Fortran 77 versions of these routines. Probably it might
      be possible to find versions that seem appropriate for every language.

F) How to access distributed arrays in the SPMD program ?

Another important issue is that it is absolutely unclear how to access
the local parts of the arrays A and B within the SPMD program. How
can they be initialized ? Can they only be accessed via HPF programs ?

The routines copy_mapped_array_<sec_>TTT (...) will work only on 
mapped arrays. How can mapped arrays interact with other data in the SPMD
program?

G) Interactions between FORTRAN_LOCAL routines 

As calling global HPF from SPMD seems to be similar to calling F77 code from 
global HPF, there should be some connections that should be pointed out
clearly.

PROPOSAL:
=========

The following proposal describes how to call SPMD code from HPF and vice
versa.

The essential ideas are:

  - Macros are used to define the interface between global HPF code
    and SPMD code to overcome implementation-specific rules for 
    calling conventions. These macros are used for SPMD subroutines that
    are called from HPF and for calls of HPF programs from SPMD code.

  - Macros are used to define/access data structures used for mapped arrays
    within SPMD code

I. Writing Local Routines

A local routine can be invoked from a global HPF program.

       SUBROUTINE DOIT (A)
       REAL A(:,:)
!HPF$  DISTRIBUTE A(BLOCK,BLOCK)
       INTERFACE
          EXTRINSIC(HPF_LOCAL) SUBROUTINE INIT (A, S)
          REAL A(:,:), S
!HPF$     DISTRIBUTE A(BLOCK,BLOCK)
          END SUBROUTINE INIT
       END INTERFACE
       ...
       call INIT (A,1.0)
       ...
       END

Hint: An interface is required if there is any implicit redistribution at 
the subroutine boundary.

The code realizing INIT can be any SPMD code (also in other languages)
that has to fit the calling conventions.

II. Writing Local SPMD Code in FORTRAN 77

If the local code is not written in HPF, there must be some conventions
for how the local code is called from global routines. This is obviously
implementation-specific.

One solution could be to call local code written in other languages by
some new rules (other extrinsic models). The main problem here is how
to get access to the descriptors of mapped data.

An SPMD program that is called from an HPF program has to be defined
in the following way:

      SUBROUTINE INIT HPFM_ARGS(`HPFM_IS_MAPPED(V),HPFM_IS_SEQ(S)')
      HPFM_ENVIRONMENT
      HPFM_DECLARE_MAPPED_ARRAY(V,2,REAL)
      HPFM_DECLARE_SEQ(S)
      REAL S

For the interface the following macros will be used:

  HPFM_ARGS      : has to precede the dummy argument list.
  HPFM_IS_MAPPED : has to precede every argument that corresponds to 
                   a mapped array.
  HPFM_IS_SEQ    : has to precede every argument that corresponds
                   to sequential data, especially scalar values.

  HPFM_ENVIRONMENT : is used to insert some compiler specific definitions 
                     (might be empty).

  HPFM_DECLARE_MAPPED_ARRAY: is used to declare the dummy arguments that 
                             will be used for passing HPF distributed arrays.
  HPFM_DECLARE_SEQ : is used to declare the dummy arguments
      that might additionally be used for passing sequential data. The 
      sequential data has in any case to be defined as a usual argument.

This mechanism guarantees that for calling INIT in an HPF routine it
can be considered like calling a usual HPF routine. An interface is not
required unless other HPF conventions require this (e.g. if a redistribution
at the subroutine boundary is required).

Note:

    There are approved extensions for FORTRAN_LOCAL routines that
    require an explicit interface in the HPF program. By the interface,
    the HPF compiler knows whether it has to pass a pointer or a descriptor
    of a mapped array. This approach is more general.

III. Calling Global HPF Routines from SPMD Programs

HPF invocations will import scalars and global arrays allocated from
the SPMD program. The primary concern here is how to
deal with global arrays.

      call sub HPFM_ARGS(`HPFM_MAPPED(V), HPFM_SEQ(S)')

For calling global HPF routines the following macros will be used:

  HPFM_ARGS:   has to precede the actual argument list.
  HPFM_MAPPED: has to precede every actual argument that
               corresponds to a mapped array.
  HPFM_SEQ:    has to precede every actual argument that
               corresponds to sequential data, especially if 
               scalar values are passed.

IV. Mapped Arrays

Until now, the SPMD program can access only the data passed by global
HPF routines via arguments.

Now, an SPMD program can also define mapped arrays and 
mapped data. These mapped arrays can later be passed to global
HPF routines. Currently, the possibilities for defining mapped
arrays are restricted, especially with respect to alignment.

Mapped arrays are defined via macros (implementation-specific data
structures for mapped data). Data can be allocated via special
subroutines (this follows what Appendix F of HPF 2.0 currently proposes,
but some cleanup might be required).

      HPFM_ENVIRONMENT
      HPFM_DECLARE_MAPPED_ARRAY(V,2,REAL)

      INTEGER EXT(4), DECOMP(2)

      EXT = (/1,100,1,100/)
      DECOMP = (/HPFM_BLOCK_DISTRIBUTION, HPFM_SERIAL_DISTRIBUTION/)

      ! HPFM_MAPPED_ARRAY(V) finds correct arguments for passing V

      call alloc_mapped_array (HPFM_MAPPED_ARRAY(V), 2, EXT, DECOMP)

The syntax and semantics of alloc_mapped_array and alloc_aligned_array
still have to be defined more clearly.
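
To make the EXT/DECOMP arguments above more concrete, here is a minimal
sketch in C of how the locally owned index range of one dimension could be
derived for a BLOCK distribution. The helper name block_bounds, its argument
order and the 0-based processor numbering are assumptions made for
illustration only; they are not part of the proposed interface. The block
size rule (ceiling of the extent divided by the number of processors) is the
usual HPF BLOCK definition.

```c
/* Hypothetical helper, not part of the proposed interface.
 * For one dimension with global bounds lb..ub that is distributed BLOCK
 * over nproc processors, compute the global index range owned by
 * processor p (numbered from 0).  An empty range is reported as *lo > *hi. */
void block_bounds(long lb, long ub, long nproc, long p, long *lo, long *hi)
{
    long extent = ub - lb + 1;
    long bs = (extent + nproc - 1) / nproc;  /* block size = ceiling(extent/nproc) */

    *lo = lb + p * bs;
    *hi = *lo + bs - 1;
    if (*hi > ub)
        *hi = ub;                            /* the last non-empty block may be short */
}
```

With EXT = (/1,100,1,100/) and four processors, the first (BLOCK) dimension
would give processor 3 the rows 76 to 100 under these assumptions, while the
second (serial) dimension stays 1 to 100 on every processor.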

V. Sharing Data between Global and Local Routines

Obviously one can access in local routines only the data that has been
passed by arguments. There is no possibility to access any global data.

This limitation might be very restrictive, especially for porting existing
FORTRAN 77 applications. Therefore we propose that global and local routines
can share data of a COMMON block as long as the data is not mapped.

       SUBROUTINE DOIT (A)
       REAL A(:,:)
!HPF$  DISTRIBUTE A(BLOCK,BLOCK)
       COMMON /DATA/ A1, A2
       REAL A1, A2
       ...
       call INIT (A,1.0)
       ...
       END

       EXTRINSIC(HPF_LOCAL) SUBROUTINE INIT (A,S)
       REAL A(:,:), S
!HPF$  DISTRIBUTE A(BLOCK,BLOCK)
       COMMON /DATA/ A1, A2
       REAL A1, A2
       A = S
       END SUBROUTINE INIT

Furthermore, we propose that distributed arrays of COMMON blocks
can also be accessed via the same mechanism as dummy arguments.

!     global HPF code

      SUBROUTINE DOIT ()
      COMMON /DATA/ A(N,N)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK)
      ...
      call INIT ()
      ...
      END

!     local SPMD code, here in FORTRAN 77

      SUBROUTINE INIT HPFM_ARGS(`')

      COMMON /DATA/ HPFM_MAPPED_COMMON(A)
      HPFM_DECLARE_MAPPED_ARRAY(A,2,REAL)
      ...
      END 

VI. Access of Mapped Data in SPMD Code

A new macro provides access to the first local element of a mapped array.

     HPFM_MAPPED_FIRST(v)

This macro can be used in the following way:

      HPFM_DECLARE_MAPPED_ARRAY(V,2,REAL)
      ... ! V is allocated or dummy argument
      CALL SPMD_SHAPE (SHP, HPFM_MAPPED_ARRAY(V))
      ! local shape is SHP(1) x SHP(2)
      CALL DO_IT (HPFM_MAPPED_FIRST(V), SHP(1), SHP(2), ..)

      SUBROUTINE DO_IT (V,N1,N2,...)
      INTEGER N1, N2
      REAL V(N1,N2), S
      DO J=1,N2
         DO I=1,N1
            V(I,J) = ...
         END DO
      END DO
      END

VII. Runtime Support

Change F77 routines to SPMD routines.

   SPMD_GLOBAL, SPMD_GLOBAL_DISTRIBUTION, SPMD_GLOBAL_TEMPLATE

   SPMD_ABSTRACT_TO_PHYSICAL, SPMD_PHYSICAL_TO_ABSTRACT,
   SPMD_LOCAL_TO_GLOBAL, SPMD_GLOBAL_TO_LOCAL

   SPMD_LOCAL_BLKCNT, SPMD_LOCAL_LINDEX, SPMD_LOCAL_UINDEX

   SPMD_GLOBAL_SHAPE, SPMD_GLOBAL_SIZE

   SPMD_SHAPE, SPMD_SIZE

   SPMD_MY_PROCESSOR

   SPMD_SUBGRID_INFO

Probably they can be called from Fortran 90, FORTRAN 77, C, and C++ programs.
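
As an illustration of what routines such as SPMD_LOCAL_TO_GLOBAL and
SPMD_GLOBAL_TO_LOCAL have to compute, here is a minimal sketch in C for a
single BLOCK-distributed dimension. The function names, the explicit
lb/ub/nproc arguments and the 0-based processor numbering are hypothetical;
the real routines would presumably take a descriptor of the mapped array
instead, and their interfaces are not defined in this note.

```c
/* Hypothetical sketches for one BLOCK-distributed dimension lb..ub over
 * nproc processors (numbered from 0).  Local indices start at 1. */

/* Global index of local element l on processor p. */
long block_local_to_global(long lb, long ub, long nproc, long p, long l)
{
    long bs = (ub - lb + nproc) / nproc;   /* ceiling((ub-lb+1)/nproc) */
    return lb + p * bs + (l - 1);
}

/* Owning processor *p and local index *l of global index g. */
void block_global_to_local(long lb, long ub, long nproc, long g,
                           long *p, long *l)
{
    long bs = (ub - lb + nproc) / nproc;   /* same block size as above */
    *p = (g - lb) / bs;
    *l = (g - lb) % bs + 1;
}
```

The two functions are inverses of each other for every index in lb..ub,
which is the property a local routine relies on when it converts between
its own loop indices and the global view of the HPF caller.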

VIII. HPF Initialization

If the main program is an SPMD program and not an HPF program,
a call to a procedure HPF_INIT is required to establish
various state needed by the HPF run-time system that will, for
example, enable the processor to determine `who it is.'

     PROGRAM SPMD_TEST
     ...
     CALL HPF_INIT ()
     ...
     CALL HPF_EXIT ()
     END PROGRAM

IX. Example of Mixed Programming

      PROGRAM TEST

      ! definitions used within other macros

      HPFM_ENVIRONMENT

      ! define data structures for a distributed array

      HPFM_DECLARE_MAPPED_ARRAY(V,2,REAL)

      REAL S

      INTEGER N

      INTEGER EXT(4), DECOMP(2)

      ! initializes HPF environment (is a subroutine)

      call HPF_INIT()

      EXT = (/1,100,1,100/)
      DECOMP = (/HPFM_BLOCK_DISTRIBUTION, HPFM_SERIAL_DISTRIBUTION/)

      ! HPFM_MAPPED_ARRAY(V) finds correct arguments for passing V

      call alloc_mapped_array_REAL (HPFM_MAPPED_ARRAY(V), 2, EXT, DECOMP)

      ! arguments to an HPF routine are specially handled
      ! HPFM_ARGS(...) translates argument list correctly
      ! HPFM_MAPPED(V) if V is an HPF mapped array
      ! HPFM_SEQ(S) if S is a usual Fortran argument

      call sub HPFM_ARGS(`HPFM_MAPPED(V), HPFM_SEQ(S)')

      print *, 'S = ', S

      ! terminate HPF execution correctly (is subroutine)

      call HPF_EXIT ()

      END

      ! HPF routine compiled by the HPF compiler

      subroutine sub (V, SUM1)

      real V(:,:)
      real SUM1
!hpf$ distribute V (Block,*)

      ! here we call a routine that will be realized in Fortran 77

      call init (V,3.0)

      SUM1 = sum(V)
      end

/* SPMD program written in C */

      void init HPFM_ARGS(`HPFM_IS_MAPPED(v),HPFM_IS_SEQ(s)')
      HPFM_DECLARE_MAPPED_ARRAY(v,2,float)
      HPFM_DECLARE_SEQ(s)
      float s;

      { int shp[2];
        int i, j;
        float *ptr;

        /* local shape is shp[0] x shp[1] */
        spmd_shape (shp, HPFM_MAPPED_ARRAY(v));
        ptr = HPFM_MAPPED_FIRST(v);

        for (j=0; j<shp[1]; j++)
          for (i=0; i<shp[0]; i++)
             *ptr++ = s;

      } /* init */

      _______    _   _       _____   Thomas Brandes
     / _____/   / | / |     / ___ |  Forschungszentrum Informationstechnik
    / / ____   /  |/  |    / /  / /  Schloss Birlinghoven
   / / /_  /  / /   / |   / /  / /   D-53754 Sankt Augustin
  / /___/ /  / /|  /| |  / /__/ /    Tel: +49-2241-14-2492   Fax: 2181
  \______/  /_/ |_/ |_| /______/     Email: brandes@gmd.de
                                     http: //www.gmd.de/SCAI/people/brandes.html


From owner-hpff-doc  Fri Sep 13 08:02:35 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id IAA26100 for hpff-doc-out; Fri, 13 Sep 1996 08:02:35 -0500 (CDT)
Received: from mail.gmd.de (mail.gmd.de [129.26.8.90]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id IAA26094 for <hpff-doc@cs.rice.edu>; Fri, 13 Sep 1996 08:02:29 -0500 (CDT)
Received: from fichte by mail.gmd.de with SMTP id AA19026
  (5.67b8/IDA-1.5 for <hpff-doc@cs.rice.edu>); Fri, 13 Sep 1996 15:02:12 +0200
Received: by fichte id AA18516
  (5.67b/IDA-1.5); Fri, 13 Sep 1996 15:01:41 +0200
Date: Fri, 13 Sep 1996 15:01:41 +0200
From: Thomas Brandes <Thomas.Brandes@gmd.de>
Message-Id: <199609131301.AA18516@fichte>
To: pm@icase.edu, offner@hpc.pko.dec.com, hpff-doc@cs.rice.edu
Subject: hpff-doc: Comments on Extended Mapping
Cc: Thomas.Brandes@gmd.de
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Md5: Bg+tyHnuIwioANliL7uSIA==
Sender: owner-hpff-doc
Precedence: bulk

8.11 Range Directive, Syntax
============================

In all the examples the 'ranger' is missing.

!HPF$ RANGE X (BLOCK,*), (CYCLIC,*)

Or can we combine the INHERIT and RANGE directives (might be useful!)?

!HPF$ INHERIT X, Y
!HPF$ RANGE (BLOCK,*), (CYCLIC,*)

Only the following is supported:

!HPF$ INHERIT, RANGE (BLOCK,*), (CYCLIC,*) :: X, Y

Thomas Brandes

From owner-hpff-doc  Fri Sep 13 14:18:30 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id OAA13036 for hpff-doc-out; Fri, 13 Sep 1996 14:18:30 -0500 (CDT)
Received: from VNET.IBM.COM (vnet.ibm.com [199.171.26.4]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id OAA13017 for <hpff-doc@cs.rice.edu>; Fri, 13 Sep 1996 14:18:23 -0500 (CDT)
Message-Id: <199609131918.OAA13017@cs.rice.edu>
Received: from TOROLAB2 by VNET.IBM.COM (IBM VM SMTP V2R3) with BSMTP id 4483;
   Fri, 13 Sep 96 15:18:21 EDT
Date: Fri, 13 Sep 96 15:16:59 EDT
From: "Wai Ming Wong" <wmwong@vnet.ibm.com>
To: lfm@pgroup.com, hpff-doc@cs.rice.edu
Subject: hpff-doc: Comments on Asynch I/O
Sender: owner-hpff-doc
Precedence: bulk

Hi,

I have the following questions on asynchronous I/O:

1  With HPF asynchronous I/O, it is only required that an asynchronous
   data transfer statement be eventually followed by a matching WAIT
   statement.  It is possible that the data transfer and the matching
   WAIT statements are in different procedures and some of the I/O list
   items may be out of scope at the time the WAIT is executed.
   Should the matching WAIT statement be only allowed in the same procedure
   scope of the asynchronous data transfer statement?

2  Does HPF require that the ERR= label and the IOSTAT= variable
   in an asynchronous data transfer statement be the same
   as those in the matching WAIT statement?

I've found a minor error on page 213, advice to users:

   READ(10,ID=5,REC=10)I,A(I)

   should read:

   READ(10,ID=idnum,REC=10)I,A(I)

   Reason: Only scalar-default-int-variable is allowed for ID=.

Thanks,

Wai Ming Wong
Compiler Development
IBM Software Solutions Toronto Laboratory
wmwong@vnet.ibm.com
Tel: (416) 448-3105

From owner-hpff-doc  Mon Sep 16 09:42:27 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id JAA11022 for hpff-doc-out; Mon, 16 Sep 1996 09:42:27 -0500 (CDT)
Received: from mail.gmd.de (mail.gmd.de [129.26.8.90]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id JAA11015 for <hpff-doc@cs.rice.edu>; Mon, 16 Sep 1996 09:42:20 -0500 (CDT)
Received: from fichte by mail.gmd.de with SMTP id AA11131
  (5.67b8/IDA-1.5 for <hpff-doc@cs.rice.edu>); Mon, 16 Sep 1996 16:41:53 +0200
Received: by fichte id AA01166
  (5.67b/IDA-1.5); Mon, 16 Sep 1996 16:41:22 +0200
Date: Mon, 16 Sep 1996 16:41:22 +0200
From: Thomas Brandes <Thomas.Brandes@gmd.de>
Message-Id: <199609161441.AA01166@fichte>
To: pm@icase.edu, offner@hpc.pko.dec.com, guy.steele@east.sun.com,
        hpff-doc@cs.rice.edu
Subject: hpff-doc: Comments on Mapping
Cc: Thomas.Brandes@gmd.de, Falk.Zimmermann@gmd.de
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Md5: 2TF0LVWhh0m4OtURt60X3Q==
Sender: owner-hpff-doc
Precedence: bulk


I would like clarification about the following problem:

      REAL, ALLOCATABLE :: A(:,:), B(:,:)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK)
!HPF$ ALIGN B (:,:) WITH A(:,:)

      ALLOCATE (A(N,N))
      ALLOCATE (B(N,N))

P. 28 states that the ALIGN directive is equivalent to:

!HPF$ ALIGN B(I,J) WITH A(I-lbound(B,1)+lbound(A,1),     &
!HPF$                     J-lbound(B,2)+lbound(A,2))

with some attached requirements.

Now section 2.5 says:

'... ALIGN directives do not take effect immediately, however; they
 take effect each time the array is allocated by an ALLOCATE statement,
 rather than on entry to the scoping unit.'

Fine; this would imply that the program is conforming.

But it also says:

'The values of all specification expressions in such a directive are 
 determined once on entry to the scoping unit and may be used multiple
 times'

Now we have the problem with the lbound(...). If they are specification
expressions, they are determined on entry (and would be undefined). The
program would be nonconforming.

I think that this needs some clarification.
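
To spell out why the lower bounds matter here, the following minimal sketch
in C computes the target index that a colon alignment produces for one
dimension. The function name colon_align_target is hypothetical. The rule
it encodes is that the first element of the alignee lands on the first
element of the align target, so the offset depends on both lower bounds,
and for allocatable arrays these are known only after the ALLOCATE
statements have been executed.

```c
/* Hypothetical helper: for a colon alignment in one dimension, return the
 * index of the align target that alignee index i is aligned with.
 * lb_alignee and lb_target are the lower bounds of the two arrays in that
 * dimension, as they are once both arrays exist. */
long colon_align_target(long i, long lb_alignee, long lb_target)
{
    return i - lb_alignee + lb_target;
}
```

If both arrays are allocated with lower bound 1, as in the example above,
the offset is zero; the question is only at which point the lbound values
are supposed to be evaluated.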

Thomas Brandes



From owner-hpff-doc  Mon Sep 16 13:29:14 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id NAA20066 for hpff-doc-out; Mon, 16 Sep 1996 13:29:14 -0500 (CDT)
Received: from [128.42.1.213] (morpheus.cs.rice.edu [128.42.1.213]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id NAA19957; Mon, 16 Sep 1996 13:27:24 -0500 (CDT)
X-Sender: chk@titan.cs.rice.edu
Message-Id: <v01540b0fae634b8d9bee@[128.42.1.213]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Mon, 16 Sep 1996 13:31:15 -0500
To: hpff-doc, hpff-core
From: chk@cs.rice.edu (Chuck Koelbel)
Subject: hpff-doc: New HPF 2.0 draft
Sender: owner-hpff-doc
Precedence: bulk

The new, improved HPF 2.0.beta draft is now available by anonymous FTP from
titan.cs.rice.edu.  You can get the whole enchilada as
/public/HPFF/hpf2.0-draft/Releases/16-sep-96.tar.gz (gzipped tar file).  Or
you can get individual .tex files from /public/HPFF/hpf2.0-draft and its
subdirectories.  This will be the version distributed at the meeting.

Thanks to all the chapter editors who got their text in on time.  Shame on
those (like me) who didn't.

A couple notes:

* Chapter editors may want to retrieve a copy of their own chapters.  Small
but crucial changes were made:
        One missing "}" was added
          (Leaving it out sent half the document into typewriter font)
        All \newcommand macros moved into syntax-macs.tex

* Anybody trying to latex the document themselves needs syntax-macs.tex.
Note instructions inside (near the bottom) for converting between LaTeX
2.09 and LaTeX2e.

* See most of you in San Francisco Wednesday.

                                                Chuck



From owner-hpff-doc  Mon Sep 16 18:54:07 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id SAA07426 for hpff-doc-out; Mon, 16 Sep 1996 18:54:07 -0500 (CDT)
Received: from VNET.IBM.COM (vnet.ibm.com [199.171.26.4]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id SAA07421 for <hpff-doc@cs.rice.edu>; Mon, 16 Sep 1996 18:54:03 -0500 (CDT)
Received: from TOROLAB by VNET.IBM.COM (IBM VM SMTP V2R3) with BSMTP id 1305;
   Mon, 16 Sep 96 19:54:20 EDT
Received: by TOROLAB (XAGENTA 4.0) id 0992; Mon, 16 Sep 1996 19:53:17 -0400 
Received: by twinpeaks.torolab.ibm.com (AIX 3.2/UCB 5.64/4.03)
          id AA26871; Mon, 16 Sep 1996 19:53:32 -0400
From: <zongaro@vnet.ibm.com> (Henry Zongaro)
Message-Id: <9609162353.AA26871@twinpeaks.torolab.ibm.com>
Subject: hpff-doc: Asynchronous I/O
To: lfm@pgroup.com, hpff-doc@cs.rice.edu
Date: Mon, 16 Sep 1996 19:53:31 -0400 (EDT)
X-Mailer: ELM [version 2.4 PL24alpha3]
Content-Type: text
Sender: owner-hpff-doc
Precedence: bulk

Hi Larry,

     I have the following comments on Asynchronous I/O.  These are based on
the August 17 draft - my apologies if these have changed or have been
corrected in the latest draft.

 - On p. 202, the text "If an ID= specifier appears, then . . . " appears
   twice - once as a constraint under H1101, and once just below.

 - On p. 202, the paragraph that begins "The addition of the ID= specifier",
   there are some typos "an WAIT" -> "a WAIT"; "specifing" -> "specifying"
   "an data transfer" -> "a data transfer".

 - In the last paragraph on p. 202, it's not clear from the
   description whether the transfer to the "ERR=" label is permitted to occur
   asynchronously.

 - On p. 203, the paragraph that begins "In the portion. . . " needs to be
   tightened up to prohibit references to entities that are associated with
   entities appearing in the data transfer statement.
   For example,

          equivalence (i,j)
          read (99, id=id, rec=1) i
            i = 1         ! Currently prohibited
            j = 1         ! Not currently prohibited
          wait (99, id=id)

   Also, things like deallocating an allocatable array that appeared in the
   I/O list, changing the association of a pointer, etc., must be prohibited.

 - On p. 203, the paragraph that begins "In the portion. . . " states that
   "no entity appearing in an expression anywhere in the input/output list
    may be assigned to or accessed."  This may be overly restrictive.  For
   example, it prohibits program fragments like the following, since "I" is
   referenced before the associated WAIT.

           do I = 1, 10
             read (99, id=ID(I), rec=I) A(:,I)
           end do
           ! do some other stuff
           do I = 1, 10
             wait (99, id=ID(I))
           end do

 - On p. 203, the last sentence of the paragraph that begins "Multiple
   outstanding. . . " states "If two WRITE statements which specify the same
   record number are executed, then the program is non-conforming."  This needs
   to indicate that this is only non-conforming if there is no intervening
   WAIT associated with the first WRITE.

 - On p. 203, the paragraph that begins "Note:  we still permit. . . "
   indicates that things like the following are legal:

        READ(10, ID=ID, REC=10) I, A(I)

   If we permit this, then things like this would be legal:

        READ(10, ID=ID, REC=10) I, (A(J), J = 1, I)

   which means that the number of entries (and their addresses) that need to be
   read isn't known up front.  It can be argued that the evaluation of the
   I/O list can be performed asynchronously, but then that runs into things
   like this:

        J = 1
        READ(10, ID=ID, REC=10) I, A(J+I)
        J = 100
        WAIT(10, ID=ID)

   This in turn leads to requiring the restriction that I argued against two
   comments ago.  I think we should avoid permitting references to variables
   that are being defined elsewhere in the same I/O list.

 - On p. 203, the constraint below H1103 can't be a constraint, since it can't
   be checked at compile-time.

 - On p. 203, the paragraph that begins "The DONE= specifier. . . ", the term
   scalar-default-logical-variable should be in italics.

Thanks,

Henry

From owner-hpff-doc  Tue Sep 17 07:41:22 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id HAA20073 for hpff-doc-out; Tue, 17 Sep 1996 07:41:22 -0500 (CDT)
Received: from VNET.IBM.COM (vnet.ibm.com [199.171.26.4]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id HAA20064 for <hpff-doc@cs.rice.edu>; Tue, 17 Sep 1996 07:41:18 -0500 (CDT)
Received: from TOROLAB by VNET.IBM.COM (IBM VM SMTP V2R3) with BSMTP id 3587;
   Tue, 17 Sep 96 08:41:35 EDT
Received: by TOROLAB (XAGENTA 4.0) id 1095; Tue, 17 Sep 1996 08:40:52 -0400 
Received: by twinpeaks.torolab.ibm.com (AIX 3.2/UCB 5.64/4.03)
          id AA23985; Tue, 17 Sep 1996 08:41:04 -0400
From: <zongaro@vnet.ibm.com> (Henry Zongaro)
Message-Id: <9609171241.AA23985@twinpeaks.torolab.ibm.com>
Subject: hpff-doc: Comments on Portable/Efficient Constructs
To: meltzer@cray.com, chk@cs.rice.edu, hpff-doc@cs.rice.edu
Date: Tue, 17 Sep 1996 08:41:02 -0400 (EDT)
X-Mailer: ELM [version 2.4 PL24alpha3]
Content-Type: text
Sender: owner-hpff-doc
Precedence: bulk

Hello,

     I have the following comments on the "Portable and Efficient Constructs"
chapter.  My page references are with respect to the August 17 draft.  My
apologies if any of these have already been corrected:

 - In section 7.1.1, I think "(em i.e. *)" is a typo.

 - In section 7.3, there are certain requirements being placed on HPF
   compilers, whereas this chapter is supposed to describe the features which
   a user should use.  In addition, I'm not sure I entirely agree with the
   defaults being described.  In particular, I know of an implementation in
   which the default processors arrangement selected will not always be the
   same for identical shape distributees with identical mappings.  For example,

           program prog
             integer a(100, 100)
     !hpf$   processors p(number_of_processors()/2, 2)
     !hpf$   distribute a(block, block)
           end program prog

           subroutine sub
             integer a(100, 100)
     !hpf$   distribute a(block, block)
           end subroutine sub

   In prog, the array will be distributed onto p, so on 16 processors this
   would be an 8x2 processors arrangement, but in sub, a 4x4 arrangement will
   be selected.

 - In section 7.4, the precise meaning of "Alignments may not contain offsets
   or strides" needs to be spelled out.  Also, the way I read it, it means that
   the following is not allowed

          integer a(10), b(0:9)
    !hpf$ align a(i) with b(i-1)

   although the following equivalent alignment is allowed, since I've specified
   no explicit "offsets" or "strides".

          integer a(10), b(0:9)
    !hpf$ align a(:) with b(:)

 - In 7.4, the first paragraph states that "In alignment expressions the
   dimensions may not be permuted."  This is repeated as the third bullet in
   the list that follows.

 - In 7.4, in the sentence that follows the list, the word "is" should be dropped.

 - In 7.8, further to the comment that begins "What does a naive programmer do
   with this advice?", I believe the intent of this chapter was to steer the
   user away from features which might perform poorly.  A DO loop that's
   specified to be INDEPENDENT, but can't be parallelized will perform about
   the same as the same DO loop without INDEPENDENT.  Telling them to be
   careful with INDEPENDENT doesn't really give them any benefit.

 - In 7.10, it is stated that HPF_LOCAL and HPF_SERIAL "allow a program to get
   at the highest performing features of a particular architecture."  This
   might be true of HPF_LOCAL, but is probably not true of HPF_SERIAL.

 - In 7.11, the second sentence begins "The intrinsics. . . ."  I believe this
   should be "The HPF library procedures. . . ."  However, this brings up a
   second point - is use of all intrinsics encouraged?  What about the
   transformational intrinsics with non-constant DIM arguments?

 - 7.13.  Given the restrictions being put in place for pointers in HPF 2.0, I
   think this section can probably go away.

 - In the second paragraph of 7.14, it's stated that assumed shape mapped
   dummies may be used, that they may use any mapping syntax and that array
   valued function results may be explicitly mapped.  However, there is no
   indication as to whether these are restricted, recommended or not
   recommended.

 - In 7.15, the second sentence, "not suggested" should be "not recommended".
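
The 7.3 comment above can be made concrete with a small sketch in C. It
shows one plausible default, assumed here only for illustration and not
taken from the draft or from any particular compiler: when no PROCESSORS
directive is in scope, pick the most nearly square two-dimensional
arrangement. The function name near_square_grid is hypothetical.

```c
/* Hypothetical default: factor nprocs into rows x cols with cols the
 * largest divisor of nprocs that does not exceed its square root. */
void near_square_grid(int nprocs, int *rows, int *cols)
{
    int d;
    int best = 1;

    for (d = 1; d * d <= nprocs; d++)
        if (nprocs % d == 0)
            best = d;        /* remember the largest divisor found so far */

    *cols = best;
    *rows = nprocs / best;
}
```

On 16 processors this default gives 4x4 for sub, while prog explicitly
declares p(number_of_processors()/2, 2), i.e. 8x2, so two identically
declared and identically distributed arrays end up on different processor
arrangements.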

Thanks,

Henry

From owner-hpff-doc  Tue Sep 17 08:26:57 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id IAA20821 for hpff-doc-out; Tue, 17 Sep 1996 08:26:57 -0500 (CDT)
Received: from VNET.IBM.COM (vnet.ibm.com [199.171.26.4]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id IAA20805 for <hpff-doc@cs.rice.edu>; Tue, 17 Sep 1996 08:26:52 -0500 (CDT)
Received: from TOROLAB by VNET.IBM.COM (IBM VM SMTP V2R3) with BSMTP id 5494;
   Tue, 17 Sep 96 09:27:06 EDT
Received: by TOROLAB (XAGENTA 4.0) id 1158; Tue, 17 Sep 1996 09:25:17 -0400 
Received: by twinpeaks.torolab.ibm.com (AIX 3.2/UCB 5.64/4.03)
          id AA24380; Tue, 17 Sep 1996 09:25:26 -0400
From: <zongaro@vnet.ibm.com> (Henry Zongaro)
Message-Id: <9609171325.AA24380@twinpeaks.torolab.ibm.com>
Subject: hpff-doc: Comments on Extended Library
To: schreiber@hpl.hp.com, hpff-doc@cs.rice.edu
Date: Tue, 17 Sep 1996 09:25:24 -0400 (EDT)
X-Mailer: ELM [version 2.4 PL24alpha3]
Content-Type: text
Sender: owner-hpff-doc
Precedence: bulk

Hi Rob,

     I have the following comments on Chapter 10.  They are with respect to the
August 17 draft.  My apologies if any of these have already been dealt with.

 - In the second paragraph of p. 189, it's not clear what is meant by a
   (*, block block) distribution.

 - In the fourth paragraph of p. 189, change "rank(array)" to "the rank of the
   array argument", or something like that.  The same comment applies twice
   under 10.1.3.

 - In the last paragraph of p. 189, "modified ON constructs" should be
   "modified by ON constructs".

 - In the first paragraph of p. 190, third sentence, change "its align target"
   to "its ultimate align target".

 - In 10.1.1, wasn't there a suggestion to rename ACTIVE_NUM_PROCS to
   NUM_ACTIVE_PROCS, to make it grammatically correct?

 - In 10.1.2, under "Result Value", there are two pairs of parentheses
   following "PROCESSORS_SHAPE".

Thanks,

Henry

From owner-hpff-doc  Sat Sep 21 19:50:56 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id TAA27538 for hpff-doc-out; Sat, 21 Sep 1996 19:50:56 -0500 (CDT)
Received: from coral.llnl.gov (coral.llnl.gov [134.9.1.2]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id TAA27532 for <hpff-doc@cs.rice.edu>; Sat, 21 Sep 1996 19:50:53 -0500 (CDT)
Received: (from zosel@localhost) by coral.llnl.gov (8.7.5/8.7.3/LLNL-Jun96) id RAA05747 for hpff-doc@cs.rice.edu; Sat, 21 Sep 1996 17:50:51 -0700 (PDT)
Date: Sat, 21 Sep 1996 17:50:51 -0700 (PDT)
From: Mary E Zosel <zosel@coral.llnl.gov>
Message-Id: <199609220050.RAA05747@coral.llnl.gov>
To: hpff-doc@cs.rice.edu
Subject: hpff-doc: HPF document release plan
Mime-Version: 1.0
Content-Type: text/plain; charset=X-roman8
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-doc
Precedence: bulk

HPF Document Revisions Schedule

Comments welcome anytime on any version.

Revised chapters in by Oct 3 ... (earlier submissions encouraged).

New draft Oct. 4 

Intensive review following week. Editors fix as quickly as possible.

Drafts due back evening Oct. 10 - Mary will test integration of the
document ... and forward to Rice where Ken and helpers will put out
final copy.

----------------------------------


New chapter order and responsible person

chapter 1 - overview   Saday - Mary send changes.tex (done)
                       ... Piyush fix syntax

Chapter 2 - Distributions Piyush -  specialization updates ... 

Chapter 3 NEW NAME - INDEPENDENT AND RELATED DIRECTIVES - 
      rearrangement done by Mike - review - ROB - Piyush back-up

Chapter 4 - Mapping and Subroutines Carl ....

Chapter 5  Extrinsics .... - David - still under construction - 

Chapter 6 - now library ... (can someone fix \?)

Part separator ...  piyush will review words

Chapter 7 - extended mappings --- Piyush (and Carl)

Chapter 8 -  ON / Task   ---- Chuck ....

Chapter 9  - Async I/O ... Henry responsible .... 

Chapter 10 -  extended library - Rob

Chapter 11 - Approved extrin  (Mary)

      - Mary - first part 
      - Henry - c-interop
      - Carol - F77 section

Annexes  

BNF - Guy - working on it 
Subset Mary ...    FIX TITLE  after input from Saday ....
Old ack - Mary ....
Bib - Chuck ... reference standards and committee documents
Policy - Mary
Craft - Mary will email  Larry / Jon .... 
F77 library  - Carol

From owner-hpff-doc  Mon Sep 30 22:50:49 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id WAA14606 for hpff-doc-out; Mon, 30 Sep 1996 22:50:49 -0500 (CDT)
Received: from moe.rice.edu (moe.rice.edu [128.42.5.4]) by cs.rice.edu (8.7.1/8.7.1) with ESMTP id WAA14594 for <hpff-doc@cs.rice.edu>; Mon, 30 Sep 1996 22:50:46 -0500 (CDT)
Received: from coral.llnl.gov (coral.llnl.gov [134.9.1.2]) by moe.rice.edu (8.7.1/8.7.1) with ESMTP id OAA17827 for <hpff-doc@cs.rice.edu>; Mon, 30 Sep 1996 14:17:15 -0500 (CDT)
Received: (from zosel@localhost) by coral.llnl.gov (8.7.5/8.7.3/LLNL-Jun96) id MAA15525; Mon, 30 Sep 1996 12:17:14 -0700 (PDT)
Date: Mon, 30 Sep 1996 12:17:14 -0700 (PDT)
From: Mary E Zosel <zosel@coral.llnl.gov>
Message-Id: <199609301917.MAA15525@coral.llnl.gov>
To: zongaro@vnet.ibm.com
Subject: hpff-doc: Comments on C-Interoperability section
Cc: hpff-doc@cs.rice.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=X-roman8
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-doc
Precedence: bulk

Henry
Here are some comments on the C interop section ...
They are based on the page numbers / line numbers I get in the
version of the hpf-local-ext that I just posted to hpff-doc.

page 12 section 2.4.1

Important:  the 3rd/4th constraints say that the bounds of the
dummy must be a constant - if I read it right --- this seems
exact opposite of the requirement that local-dummies be assumed
shape in Fortran/HPF ... why does this constraint exist?  ... does
it mean we can't call C-locals?  Am I misunderstanding?

same section - all these constraints ... are these really constraints
in the formal sense?  If they are - we need to reference the specific
syntax rule they are constraining.  Otherwise these could be reworded
to just be a list of rules that an interface must follow.

line 23 next paragraph following constraints - last sentence - ... shouldn't
it be  "as if it were specified"  instead of "was"?  (subjunctive mood).

section 2.4.2

line 41 first paragraph  of section 2.4.2
TYPE no longer exists ... but if you just replace it
with MAP_TO - then it is confusing with the MAP_TO in the first 
sentence.   Maybe the three references to  XYZ specifier in this
paragraph need to be changed to the syntactic name ... so that this
TYPE becomes  {\tt map-to-type-spec}

bottom page 12 and page 13

list of constraints ... again - are these really
constraints?  if so refer to the rule - otherwise list them just
as text or a set of numbered rules or ...

line 6 fourth constraint in this list ... layout-spec shall not be 
specified for an assumed-size array ... What is this ... is this
when the actual is assumed-size that a layout spec is forbidden?
If so put the phrase "actual argument" in the sentence.


line 25 paragraph following advice to users ... last sentence 
The HPF dummy argument types ...  should this be HPF actual argument types?
        ^^^^^                                        ^^^^^^

line 27 next paragraph - parenthesized 
                           ^
line 36 fix the quotes around "C"
ditto line 46 - quotes around INT

page 14
line 2  - again funny quotes

page 15 lines 26-27   c_sub prints funny  - the underscore needs
some attention.

