From owner-hpff-external  Fri Sep  8 09:24:19 1995
Received: by cs.rice.edu (JAA14601); Fri, 8 Sep 1995 09:24:19 -0500
Received: from [128.42.1.213] by cs.rice.edu (JAA14562); Fri, 8 Sep 1995 09:24:08 -0500
Date: Fri, 8 Sep 1995 09:24:08 -0500
X-Sender: chk@titan.cs.rice.edu
Message-Id: <v01530500ac75bbbad24e@[128.42.1.213]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
To: Daniele Rizzi <D.Rizzi@qmw.ac.uk>, hpff-external
From: chk@cs.rice.edu (Chuck Koelbel)
Subject: hpff-external: Re: A question about HPF_LOCAL (from comp.lang.fortran)
Sender: owner-hpff-external
Precedence: bulk

---------------------------------------------------------------------------
hpff-external@cs.rice.edu is a mailing list for discussion of external
interfaces between HPF and other languages/systems.  Instructions for
adding or deleting yourself from this list appear at the bottom of this
message.
---------------------------------------------------------------------------

Hi, Danny -

I'm forwarding this to the hpff-external mailing list, which is considering
better interfaces between HPF and other parallel paradigms.  Sorry, I don't
have a worked example for you.  Just a few comments:

The primary goal of HPF was to support "Data parallel programming (defined
as single threaded, global name space, and loosely synchronous parallel
computation)" (taken from the language spec).  The master-slave model is
multi-threaded, at least if you're doing anything interesting with
controlling the task scheduling.  That's the primary reason that HPF
doesn't support it well.

Also, HPF goes to fair lengths to avoid having the user write the
communication explicitly.  That's why you're having trouble inserting
communication.

EXTRINSIC(HPF_LOCAL) supplies a bare-bones interface to an SPMD programming
paradigm.  Because of the variety of architectures that were considered,
consensus didn't seem possible on what communications features to use
(remember, this was designed before MPI).  So we wimped out and left that
to the vendors.  I'm surprised that DEC would recommend HPF_LOCAL without
telling you how to link their communications routines; maybe they're listed
in another section of the manual (or another manual).  I have no direct
experience with the compiler, though, so I can't say for sure.

                  Chuck Koelbel

>Path:
>rice!newsfeed.rice.edu!bcm.tmc.edu!cs.utexas.edu!howland.reston.ans.net!tan
>k.news.pipex.net!pipex!sunsite.doc.ic.ac.uk!qmw!news.lpac.ac.uk!usenet
>From: Daniele Rizzi <D.Rizzi@qmw.ac.uk>
>Newsgroups: comp.lang.fortran
>Subject: A question about HPF_LOCAL
>Date: 7 Sep 1995 08:50:21 GMT
>Organization: lpac
>Lines: 78
>Message-ID: <42mbod$dko@gateway.lpac.ac.uk>
>NNTP-Posting-Host: neptune.lpac.ac.uk
>Mime-Version: 1.0
>Content-Type: text/plain; charset=us-ascii
>Content-Transfer-Encoding: 7bit
>X-Mailer: Mozilla 1.1N (X11; I; OSF1 V3.2 alpha)
>X-URL: news:comp.lang.fortran
>
>Hi all,
>
>I'm working on a 4-node Alpha Farm coupled with a HPF v1.0 compiler, and
>I'm supposed to write a parallel program using the Master-Slave model.
>Briefly, such a program could be described (in a
>pseudo-parallel-language) by a loop:
>
>        PARDO i = 1, number_of_processors()
>                call foo(a, b, c, ...)
>        END PARDO
>
>which forks copies of itself through the available peers; the program
>running on the master and on the slaves is the same, but the instructions
>executed depend on the type of the node, and on the data it has to process.
>
>Unfortunately, it seems that this computer paradigm is not well-supported
>in HPF, because:
>
>        + a DO loop is never compiled as a parallel construct;
>
>        + and a FORALL loop is parallelized only if the included subroutine
>          (in this case, foo() ) is declared PURE, which in turn means
>          that it cannot:
>
>                - modify global variables;
>                - define variables with SAVE attribute;
>                - use dummy arguments without INTENT(IN) statement;
>                - call a function or a subroutine non-PURE;
>                etc.
>
>In a few words, there is no way to communicate or to share data between the
>different slaves, or between the master and the slaves.
>
>"HPF Handbook" and "DEC F90 Manual" suggest what they call "Loosely
>Synchronous Parallel Execution" model: "Although all processors execute the
>same program, the processors are not necessarily processing the exact same
>instruction at the same time".  This should be implemented using the
>EXTRINSIC(HPF_LOCAL) declaration, but, oddly enough, no working example is
>provided, and there still remains open the question about the communication
>between the peers.  The DEC Manual states: "it is up to you to insert
>whatever SENDs and RECEIVEs are necessary", without giving any information
>about the syntax of SEND and RECEIVE calls.
>
>Is there any example, or any fragment of code, showing how
>explicit communication can be carried out inside HPF?
>
>Thanks for your help
>
>dany -> d.rizzi@lpac.ac.uk


---------------------------------------------------------------------------
To (un)subscribe to this list, send mail to hpff-external-request@cs.rice.edu.
Leave the subject line blank, and in the body put the line
(un)subscribe <email-address>
---------------------------------------------------------------------------

From owner-hpff-external  Fri Sep  8 14:43:18 1995
Received: by cs.rice.edu (OAA02191); Fri, 8 Sep 1995 14:43:18 -0500
Received: from timbuk.cray.com by cs.rice.edu (OAA02178); Fri, 8 Sep 1995 14:43:13 -0500
Received: from sdiv.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.6.12/CRI-gate-8-2.5) with SMTP id OAA03873 for <hpff-external@cs.rice.edu>; Fri, 8 Sep 1995 14:43:12 -0500
Received: from hickory304 by sdiv.cray.com (5.x/CRI-5.15.b.orgabbr Sdiv)
	id AA04810; Fri, 8 Sep 1995 14:43:10 -0500
From: meltzer@ironwood-fddi.cray.com (Andy Meltzer)
Received: by hickory304 (5.x/btd-b3)
          id AA06499; Fri, 8 Sep 1995 14:43:09 -0500
Message-Id: <9509081943.AA06499@hickory304>
Subject: hpff-external: HPF Kernel
To: hpff-external@cs.rice.edu
Date: Fri, 8 Sep 1995 14:43:09 -0500 (CDT)
X-Mailer: ELM [version 2.4 PL24-CRI-b]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-external
Precedence: bulk



Here is the latest draft of the HPF Kernel document.  It is in latex
format.


							Andy Meltzer

################

\documentstyle[twoside,11pt]{article}

\pagestyle{myheadings}
\sloppy
\footheight=-.60in
\headheight=0in
\textwidth=5.50in
\topmargin=-.20in
\oddsidemargin=0.5in
\evensidemargin=0.5in

\itemsep=.05in
\topsep=.05in
\parsep=.05in
\parskip=.10in
\textheight=9in

\newcounter{dofigures}  \setcounter{dofigures}{0}
\newcounter{dotables}        \setcounter{dotables}{0}
\newcommand{\havefigures}{\setcounter{dofigures}{1}}
\newcommand{\havetables} {\setcounter{dotables}{1}}

\newcommand{\cccdocid}[5]{
  \bibliographystyle{alpha}
% ******************** Title Page ********************
  \vspace*{.5in}
  \centerline{\Large \bf {#1}}
  \vspace*{.02in}
  \centerline{{#2}}
  \vspace*{.02in}
  \centerline{{#3}}
  \vspace*{.02in}
  \centerline{{#4}}
  \vspace*{.02in}
  \centerline{{#5}}
  \vspace*{.5in}

  % ******************** List of Figures ********************
  % Note: Remove the following line if there are no figures
%  \ifodd\value{dofigures} \listoffigures \newpage \fi
%  \ifodd\value{dotables}  \listoftables  \newpage \fi
%  \markboth{#1} {#3}
%  \pagenumbering{arabic}
  }

% ******************** Special macros ********************
%\catcode`^^Z=9                % Make TeX ignore IBMPC EOF character

\newcommand{\ital}[1]{{\it #1}}
\newcommand{\type}[1]{{\tt #1}}
\newcommand{\bold}[1]{{\bf #1}}
\newcommand{\dfn}[1]{{\it #1}\index{#1}}
\newcommand{\comment}[1]{}

\def\fineprint{\relax}
\def\stopfine{\relax}


\newcommand{\beginprog}{\begin{tabbing}
xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=\kill
}
\newcommand{\tb}{\>\tt}
\newcommand{\finprog}{\end{tabbing}}
\newcommand{\progtxt}[1]{``{\tt #1}''}
\newcounter{save_enum}
\newcommand{\hilite}[1]{ \item {\bf #1\\} }
\newcommand{\morelater}{\centerline{$\ldots$\ital{Fill this in later}$\ldots$}}


% ******************** Start of Document ********************
\begin{document}

\cccdocid
  {Kernel HPF}
  {Ver 2.0}
  {Andrew Meltzer}
  {Cray Research, Inc.}
  {August, 1995}

\begin{abstract}

HPF contains many directives which offer ease of use, but
the excessive overhead associated with these directives can severely
compromise performance on one or more (even all for some combination
of features) distributed memory parallel computers.
This report identifies a kernel of HPF which performs well across many
platforms.  It is an implementor's view and may not contain all of the
features all applications developers desire, but all of the features
which are provided enhance performance.

\end{abstract}

\section{Introduction}

HPF contains many directives which offer ease of use, but
the excessive overhead associated with these directives can severely
compromise performance on one or more (even all for some combination
of features) distributed memory parallel computers.
Kernel HPF is a high performance subset of HPF.  The features chosen
for the Kernel HPF subset were selected for the following reasons:
\begin{itemize}
\item They allow high-performance across all platforms.
\item On no platform are there performance ``surprises.''  Few, if any, 
      combinations
      of Kernel HPF features unexpectedly lead the user into low-performance 
      circumstances, although it is still possible for application
      programmers to write low-performance codes and for vendors to 
      provide low-performance implementations.
\item They are easily understood.
\item They are commonly used.
\item They are valuable for all platforms.
\end{itemize}

Kernel HPF limits the directives from HPF
to a core subset that can be shown to be high performance across a broad 
spectrum of machines.   

Items are taken from throughout the entire HPF specification; Kernel HPF 
does not limit itself to selections from Subset HPF.

The features chosen for Kernel HPF are not a randomly
selected set of the available features.  They are chosen based on
the above constraints.  They are also chosen to
simplify the HPF model by reducing the variety of mappings to
only those most commonly used and to simplify
the model by removing a level of abstraction.

The first part of this paper describes each feature and its Kernel HPF
definition along with a rationale.
Part two of this paper steps through the HPF specification, describing 
the changes to the features of each section.

With the proposed changes to full HPF that allow all remappings to be
done by the caller, an extrinsic environment is no longer necessary for
the kernel because the calling conventions will be the same.  It is
recommended that HPF\_KERNEL subroutines be distinguished by a 
\begin{verbatim}
        CHPF$ 	HPF_KERNEL
\end{verbatim}
directive.  This directive indicates that the subroutine which follows
is a Kernel HPF subroutine.


\section{Kernel HPF Features}

This section describes each of the features of Kernel HPF in detail
and gives a rationale for the inclusion or exclusion of each. 

\subsection{The \type {DISTRIBUTE} Directive}

The {\type {DISTRIBUTE}} directive remains mostly intact; only a few pieces
of functionality have been removed.
\subsubsection{Distribution Formats}

Kernel HPF includes only the {\type {BLOCK}} distribution.  It does
not include the {\type {CYCLIC}}, {\type {CYCLIC(N)}}, or {\type {BLOCK(N)}}
distributions.  The effort to determine the locations of the elements
in an array with the {\type {BLOCK}} distribution is much less than for the
other distributions.
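
{\em {Illustration}}: The difference in addressing effort can be sketched
outside of HPF.  The Python fragment below is not part of the Kernel HPF
proposal; it is a minimal sketch that assumes zero-based positions counted
from the array's lower bound, a {\type {BLOCK}} size of $\lceil n/p \rceil$,
and {\type {CYCLIC(N)}} blocks dealt round-robin to the processors.  The
names {\type {block\_owner}} and {\type {cyclic\_owner}} are hypothetical.

\begin{verbatim}
```python
def block_owner(i, n, p):
    """Owner and local offset of position i (0-based) in an n-element
    array distributed BLOCK over p processors (assumed size ceil(n/p))."""
    size = -(-n // p)          # ceil(n / p): one contiguous block each
    return i // size, i % size


def cyclic_owner(i, k, p):
    """Owner and local offset of position i (0-based) under CYCLIC(k) on
    p processors: blocks of k elements are dealt out round-robin."""
    blk = i // k               # global block number
    return blk % p, (blk // p) * k + i % k


if __name__ == "__main__":
    n, p = 100, 4
    print([block_owner(i, n, p) for i in (0, 24, 25, 99)])
    print([cyclic_owner(i, 8, p) for i in (0, 7, 8, 32, 99)])
```
\end{verbatim}

Under these assumptions {\type {BLOCK}} needs a single division with
remainder, while {\type {CYCLIC(N)}} needs a division by the block size
followed by a second division by the number of processors.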

%Kernel HPF includes {\type {BLOCK}}, {\type {CYCLIC}}, and 
%{\type {CYCLIC(N)}} distributions.  It does not include {\type {BLOCK(N)}}
%distributions.

%{\em{Rationale}}:
%\begin{itemize}
%\item 	{\type {BLOCK}} is the most common and useful distribution in common
%	practice.  It is also the most straightforward to implement efficiently.
%\item	{\type {CYCLIC}} is the next most common distribution.  It allows
%	maximal load balancing.
%\item	{\type {CYCLIC(N)}} allows the user to tune the 
%	communicaton/load-balance ratio for particular applications and 
%	particular architectures.  It is very important when fine-tuning
%	for performance.
%\item	{\type {BLOCK(N)}} is difficult to implement efficiently  and 
%	it is unclear how a high-performance application could effectively
%	exploit this feature.
%\end{itemize}

There is some sentiment that the kernel should contain more distribution
types, for example that the {\type {CYCLIC(N)}} distribution be added.

\subsubsection{\type {DYNAMIC} Distributions}

The {\type {DYNAMIC}} attribute is disallowed in Kernel HPF.

\subsubsection{On-Processor Dimensions}
The ``*'' syntax is included in the {\type {DISTRIBUTE}} directive 
{\em {dist-format-clause}} to indicate on-processor dimensions.

{\em {Rationale}}: This is a convenient technique for increasing locality of
reference, a goal for most high performance codes.  It can be accomplished
efficiently with reasonable compiler effort.


\subsection{The \type {PROCESSORS} Directive}

The {\type {PROCESSORS}} directive is part of Kernel HPF without change.

{\em {Rationale}}: The processors directive is an important way to
indicate the equivalence of mappings.  While most implementations
will probably map arrays with identical distributions and alignments
the same, portability requires that the {\type {PROCESSORS}} directive
be made available.  

The directive is also perhaps the only way to fine-tune multidimensional
distributions.


\subsection{The \type {ALIGN} Directive}

In Kernel HPF the {\type {ALIGN}} directive is restricted to
alignments that are direct;  there may be no offsets or strides
within an  {\type {ALIGN}}.  Replication and collapsing of dimensions
are also disallowed.  In alignment expressions the dimensions may
not be permuted, for example:

\begin{verbatim}

        !HPF$ ALIGN A(I,J) WITH B(J,I)  ! Not allowed in Kernel

\end{verbatim}

{\em {Rationale}}: While these measures initially appear to be 
draconian restrictions on programmability, almost all of the useful
functionality provided by these features is available through other
means which are more straightforward to optimize.  Adding strides to the 
{\type {ALIGN}} directive, among other things, greatly complicates 
the addressing, probably ends up wasting space, and increases the 
amount of information which must
be passed across subroutine boundaries. 

If the {\type {CYCLIC(N)}} distribution is accepted as part of the
kernel, almost everything that one wants to do with strided alignments 
can be done by varying the block size.  In both of the following
examples the elements of {\type A} are aligned with the even elements
of {\type B}.

    {\em {Example 1a}}
\begin{verbatim}
                DIMENSION A(128), B(256)
        CHPF$ 	DISTRIBUTE B(CYCLIC(16))
        CHPF$ 	ALIGN A(I) WITH B(2*I)	! Disallowed in Kernel
\end{verbatim}

    {\em {Example 1b}}
\begin{verbatim}
                DIMENSION A(128), B(256)
        CHPF$ 	PROCESSORS P(32)
        CHPF$ 	DISTRIBUTE A(CYCLIC(8)) ONTO P
        CHPF$ 	DISTRIBUTE B(CYCLIC(16)) ONTO P
\end{verbatim}
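
The claim that Examples 1a and 1b place the elements of {\type A}
identically can be checked with a few lines of Python.  This is a sketch
under stated assumptions, not compiler output: Example 1a names no
processor arrangement, so the same 32 processors as in Example 1b are
assumed, and {\type {CYCLIC(N)}} is assumed to deal blocks of N elements
round-robin.  The function names are hypothetical.

\begin{verbatim}
```python
def cyclic_owner(pos, k, p):
    """Processor owning 0-based position pos under CYCLIC(k) on p
    processors (blocks of k elements dealt round-robin)."""
    return (pos // k) % p


def owners_1a(p=32):
    # Example 1a: A(I) is placed with B(2*I); B(1:256) is CYCLIC(16),
    # so B(2*I) sits at 0-based position 2*I - 1.
    return [cyclic_owner(2 * i - 1, 16, p) for i in range(1, 129)]


def owners_1b(p=32):
    # Example 1b: A(1:128) is CYCLIC(8) itself; A(I) is at position I - 1.
    return [cyclic_owner(i - 1, 8, p) for i in range(1, 129)]


if __name__ == "__main__":
    print(owners_1a() == owners_1b())
```
\end{verbatim}

Both functions return the same owner for every element of {\type A},
because $\lfloor (2I-1)/16 \rfloor = \lfloor (I-1)/8 \rfloor$ for all
$I \geq 1$.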

It has been noted, however, that the vast majority of codes do not require
this functionality, so its omission from the HPF kernel would not
be overly restrictive.

Offsets are not allowed in the {\type {ALIGN}} directive of Kernel HPF.  The 
effect of offsets can
generally be achieved by changing the bounds of arrays (although this
may be implementation dependent, it is not much of a stretch).
The following two examples will create the same alignment.

    {\em {Example 2a}}
\begin{verbatim}
                DIMENSION A(1:100), B(0:99)
        CHPF$ 	PROCESSORS P(32)
        CHPF$ 	DISTRIBUTE B(BLOCK) ONTO P
        CHPF$ 	ALIGN A(I) WITH B(I-1)	! Disallowed in Kernel
\end{verbatim}

    {\em {Example 2b}}
\begin{verbatim}
                DIMENSION A(1:100), B(0:99)
        CHPF$   PROCESSORS P(32)
        CHPF$   DISTRIBUTE A(BLOCK) ONTO P
        CHPF$   DISTRIBUTE B(BLOCK) ONTO P
\end{verbatim}
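
The equivalence of Examples 2a and 2b can be checked the same way.  The
Python sketch below is an illustration only; it assumes that
{\type {BLOCK}} uses a block size of $\lceil n/p \rceil$ and that positions
are counted from each array's own lower bound, which is the
implementation-dependent point noted above.  The function names are
hypothetical.

\begin{verbatim}
```python
def block_owner(pos, n, p):
    """Processor owning 0-based position pos of an n-element array
    distributed BLOCK over p processors (assumed size ceil(n/p))."""
    size = -(-n // p)  # ceil(n / p)
    return pos // size


def owners_2a(n=100, p=32):
    # Example 2a: A(I) is placed with B(I-1); B(0:99) is BLOCK,
    # so B(j) sits at 0-based position j - 0.
    return {i: block_owner((i - 1) - 0, n, p) for i in range(1, n + 1)}


def owners_2b(n=100, p=32):
    # Example 2b: A(1:100) is BLOCK itself; A(I) is at position I - 1.
    return {i: block_owner(i - 1, n, p) for i in range(1, n + 1)}


if __name__ == "__main__":
    print(owners_2a() == owners_2b())
```
\end{verbatim}

Under these assumptions the two mappings agree element by element.  With
100 elements on 32 processors the block size is 4, so the data occupies
only 25 of the processors in either example.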

Collapsing of dimensions is disallowed because it can be accomplished 
with the {\type {DISTRIBUTE}} directive.  

Replication is not available with these restrictions, but Kernel HPF
is not meant to solve all problems.  Unfortunately, to achieve the
goal of adding replication, a whole lot of unwanted baggage would
also be required.   It may be possible to add some syntax to explicitly
recommend replication but it should not be done through the {\type {ALIGN}}
directive.

\subsection{The {\type {TEMPLATE}} Directive}

The template in a {\type {TEMPLATE}} directive must have the same rank 
and extents as the arrays aligned to it.  This, in 
essence, reduces the functionality of templates to a convenient
way of specifying that different (identically dimensioned) arrays are 
to be distributed identically.  

{\em {Rationale}}: Misuse of templates is probably the easiest way
to turn a high performance code into a low performance one, and it
is likely that different compilers will optimize different parts
of the complexities of templates due to underlying hardware performance
considerations. Even when a user finds a complicated template layout
to be high performance on one platform it will probably not be high
performance, or even moderately high performance, on all platforms.

Also, nearly all of the high-performance features of templates
are available through other mechanisms.


\subsection{The {\type {FORALL}} Statement and Construct}

These are included in Kernel HPF.

{\em {Rationale}}: {\type {FORALL}} is a powerful and expressive way
to code parallelism.  It is true that not all {\type {FORALL}} 
statements and/or constructs can be fully parallelized, but because
of their general value and because they will be part of Fortran 95,
they are included in Kernel HPF.

\subsection{{\type {PURE}} Functions and Subroutines}

These are included in Kernel HPF.

{\em {Rationale}}: {\type {PURE}} functions can do no harm, and allow
an increased level of expressiveness.  They simplify the job for
the compiler and are part of Fortran 95.

\subsection{The {\type {INDEPENDENT}} Directive}

The {\type {INDEPENDENT}} directive is included in Kernel HPF.

{\em {Rationale}}: Directives such as {\type {INDEPENDENT}} have shown
their worth historically across a broad variety of platforms.  This
directive simplifies the analysis the compiler must do and can inform
the compiler about optimizations that it may do that otherwise
are expensive or impossible to deduce.   {\type {INDEPENDENT}} allows 
an increased level
of optimization.

\subsection{Extrinsics and {\type {HPF\_LOCAL}} and {\type {HPF\_SERIAL}} }

The general purpose extrinsic mechanism, {\type {HPF\_LOCAL}}, and 
{\type {HPF\_SERIAL}} are all included in Kernel HPF.

{\em {Rationale}}: These allow a program to get at the highest 
performing features of a particular architecture.


\subsection{Library}

All of the HPF library routines are in Kernel HPF.  

{\em {Rationale}}: The intrinsics provide fast, architecture specific
versions of many very important parallel functions.  To achieve a
similar level of performance for these functions a user would have
to write non-portable code.


\subsection{The {\type {INHERIT}} Directive}

The  {\type {INHERIT}} directive is not part of Kernel HPF.

{\em {Rationale}}:  The usefulness of {\type {INHERIT}} is greatly
limited by the rest of the restrictions placed on Kernel HPF.  It
still has some use when passing, for example, array sections with
strides but the additional mechanisms that would have to be added
elsewhere to handle this case do not justify its inclusion.

\subsection{Subroutine Interfaces}

In Kernel HPF explicit interfaces are required for any subroutine
which remaps data.
Dummy arguments from actual argument array sections cannot be 
mapped ``{\type {DISTRIBUTE * ONTO *}}.'' Non-unit
strides may be specified when passing array sections as actual
arguments.

Assumed shape explicitly mapped dummies are
allowed in Kernel HPF, but explicit interfaces must be included where
they are used. Assumed shape dummies may use any mapping syntax.
The results of array valued functions may be explicitly mapped.  Assumed
size arrays are not allowed in the Kernel.

To be exact, an explicit interface is required in each of the following
cases:
\begin{itemize}
\item A parameter is passed transcriptively or with the inherit attribute.
\item The mapping of the dummy argument is not the same as the mapping of
      the actual argument.
\end{itemize}


{\em {Rationale}}: Much better control of the remapping can be achieved
if the explicit interface is available and much less code to check
mappings and remap upon entry is necessary.


\section{HPF Specification by Section}
\subsection{Sections 1 and 2. Overview and Terms and Concepts}

No changes are needed here for Kernel HPF.

\subsection{Section 3. Data Alignment and Distribution Directives}

This section contains much of the substance of the HPF specification:
the data alignment and distribution directives.  Kernel HPF severely
limits these directives to only those where high
performance across a broad spectrum of machines can be expected.

The heaviest restrictions are placed on alignment and templates.  Alignments
cannot be permuted, nor can they be offset.  Templates must be identical
in rank and size to the arrays aligned to them.

\subsection{Section 3.1 Model}

The Kernel HPF model is a
small subset of the HPF model; this subset is so restricted that a
simpler semantic model could have been used. On page 21 of the HPF
specification the model is graphically displayed as a four-level mapping
from ``arrays or other objects'' to ``group of aligned objects'' to ``abstract
processors as a user-declared cartesian mesh'' and finally to ``physical
processors.''  In Kernel HPF this model is simplified to three levels by
severely restricting the usefulness of templates.


\subsection{Section 3.2. Syntax of Data Alignment and Distribution Directives}

None of the keywords {\type {REALIGN}}, {\type {REDISTRIBUTE}}, 
{\type {INHERIT}}, or {\type {DYNAMIC}} are
part of Kernel HPF. With this exception,
section 3.2 correctly describes the syntax of the data alignment and
distribution directives of Kernel HPF.

\subsection{Section 3.3 {\type {DISTRIBUTE}} and {\type {REDISTRIBUTE}} Directives}

\type {REDISTRIBUTE} and {\type {DYNAMIC}} are not part of Kernel HPF.

\type {BLOCK} is available in Kernel HPF. 

Collapsing of dimensions can only be
achieved through distribution, not alignment.  See the following
section for details of the restriction.  The `*' syntax
is recognized by Kernel HPF and indicates that the 
dimension is on-processor.

\subsection{Section 3.4 {\type {ALIGN}} and {\type {REALIGN}}}

{\type {REALIGN}} is not part of Kernel HPF.
The {\type {ALIGN}} directive itself is very
restricted.  The following are additional rules placed on the {\type {ALIGN}}
directive:

\begin{itemize}
\item	Alignments must be direct, i.e., there can be no offsets
	specified.      For example the following is correct code:

		{\type {CHPF\$           ALIGN A(I,J) WITH B(I,J)}}

	 But the following is not:

		{\type {CHPF\$           ALIGN A(I,J) WITH B(I+1,J)}}

\item	The {\em {align-subscript-use}} in the HPF syntax rules should be
	replaced with {\em {align-dummy}}.

\item	Dimensions may not be permuted.

\item	The {\em {alignee}} and the {\em {align-target}} of 
        the {\type {ALIGN}} directive must have the same rank and extents.

\item	A `*' is not allowed in either the align-spec or the alignee.
\end{itemize}

\subsection{Section 3.5. {\type {DYNAMIC}} Directive}

The {\type {DYNAMIC}} Directive is not part of Kernel HPF.

\subsection{Section 3.6. Allocatable Arrays and Pointers}

Allocatable arrays and pointers in Kernel HPF may be explicitly mapped.

\subsection{Section 3.7. {\type {PROCESSORS}} Directive}

The {\type {PROCESSORS}} directive is allowed.

\subsection{Section 3.8. {\type {TEMPLATE}} Directive}

Templates may only be the shape (rank and extent for each dimension) of
the arrays aligned to them. This restriction, in conjunction with the
restrictions on alignment, makes moot almost all of the examples in
Section 3.8. The restrictions placed on templates are so severe, in
fact, that they reduce the usefulness of templates to a syntactically
convenient way of mapping many identical arrays in a similar fashion.

\subsection{Section 3.9. {\type {INHERIT}} Directive}

The {\type {INHERIT}} directive is not part of Kernel HPF. 

\subsection{Section 3.10 Alignment, Distribution, and Subprogram Interfaces}

Kernel HPF requires interface blocks for any arguments that are re-mapped.
This allows the caller to re-map data only when necessary, greatly 
simplifying (and consequently speeding up) the subroutine call interface.

Arbitrary array sections may be passed, but must have an explicit
interface and cannot be mapped ``{\type {DISTRIBUTE * ONTO *.}}'' Non-unit
strides may be specified when passing array sections as actual
arguments.

Kernel HPF retains the distinctions and all of the syntax relating to
transcriptive, prescriptive and descriptive mappings in full HPF.

The results of array valued functions may be explicitly mapped.

\subsection{Section 4. Data Parallel Statements and Directives}

\subsection{Section 4.1 and 4.2. The {\type {FORALL}} Statement and Construct}

The {\type {FORALL}} statement and the {\type {FORALL}} construct are part
of Kernel HPF.

\subsection{Section 4.3. Pure Procedures}

Pure functions and subroutines are part of Kernel HPF.  


\subsection{Section 4.4. The {\type {INDEPENDENT}} Directive}

The {\type {INDEPENDENT}} directive is part of Kernel HPF.  The {\type {NEW}}
clause is also part of Kernel HPF. 

\subsection{Section 5. Intrinsic and Library Procedures}

Kernel HPF should contain all intrinsic and library procedures.  

\subsection{Section 7. Storage and Sequence Association}

There is no storage or sequence association in Kernel HPF.

\end{document}



From owner-hpff-external  Fri Sep  8 15:00:03 1995
Received: by cs.rice.edu (PAA02923); Fri, 8 Sep 1995 15:00:03 -0500
Received: from timbuk.cray.com by cs.rice.edu (OAA02910); Fri, 8 Sep 1995 14:59:55 -0500
Received: from sdiv.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.6.12/CRI-gate-8-2.5) with SMTP id OAA06137; Fri, 8 Sep 1995 14:59:50 -0500
Received: from hickory304 by sdiv.cray.com (5.x/CRI-5.15.b.orgabbr Sdiv)
	id AA07766; Fri, 8 Sep 1995 14:59:48 -0500
From: meltzer@ironwood-fddi.cray.com (Andy Meltzer)
Received: by hickory304 (5.x/btd-b3)
          id AA06512; Fri, 8 Sep 1995 14:59:46 -0500
Message-Id: <9509081959.AA06512@hickory304>
Subject: hpff-external: Re: Kernel HPF
To: chk@cs.rice.edu (Chuck Koelbel)
Date: Fri, 8 Sep 1995 14:59:46 -0500 (CDT)
Cc: hpff-external@cs.rice.edu
In-Reply-To: <v01530511ac75f318d44a@[128.42.1.213]> from "Chuck Koelbel" at Sep 8, 95 02:09:38 pm
X-Mailer: ELM [version 2.4 PL24-CRI-b]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-external
Precedence: bulk



Chuck -

Thanks for the timely replies...

> 
> Random comments follow.
> 
> >\begin{abstract}
> >
> >HPF contains many directives which offer ease of use, but
> >the excessive overhead associated with these directives can severely
> >compromise performance on one or more (even all for some combination
> >of features) distributed memory parallel computers.
> >This report identifies a kernel of HPF which performs well across many
> >platforms.  It is an implementors view and may not contain all of the
> >features all applications developers desire, but all of the features
> >which are provided enhance performance.
> >
> >\end{abstract}
> 
> Something that I've never been sure about:
> Is the Kernel HPF implementation faster because only efficient features are
> included, or because the compiler knows it doesn't need to handle the
> general case?
> 
> I'm not sure how to make the distinction clear.  Basically, if the
> efficiency comes from "knowing" that the worst case won't happen, then
> calling Kernel HPF from Full HPF (or vice versa) raises questions about
> which assumptions (calling conventions) to use.  If efficiency comes from
> avoiding expensive features, then can't/won't the compiler optimize them
> away without Kernel HPF?

With the likely changes to the rules about requiring interface blocks,
I believe that an implementation can, without difficulty, use the same
calling sequence for the kernel that it uses for full HPF.  That is why
I am no longer calling it an extrinsic; it simply uses a 
	
	!HPF$ HPF_KERNEL

directive to distinguish it from the rest of the code.  I believe that
it is useful to distinguish it because the compiler can make quite a 
few simplifying assumptions if it is known to be kernel code.


> 
> >\subsection{The \type {DISTRIBUTE} Directive}
> >
> >The {\type {DISTRIBUTE}} directive remains mostly intact, only a few pieces
> >of functionality have been removed.
> >\subsubsection{Distribution Formats}
> >
> >Kernel HPF includes only the {\type {BLOCK}} distribution.  It does
> >not include the {\type {CYCLIC}}, {\type {CYCLIC(N)}}, or {\type {BLOCK(N)}}
> >distributions.  The effort to determine the locations of the elements
> >in an array with the {\type {BLOCK}} distribution is much less than the
> >other distributions.
> 
> Uh, going from countably infinite distribution patterns (i.e. any value of
> N) to one is kind of a big reduction, isn't it? :-)
> 
> I'd favor having CYCLIC distribution as well, just to satisfy people doing
> Gaussian Elimination.  Having only one pattern seems kind of useless.
> (Although there is a choice of which dimension(s) get distributed.)
> 

This area is one of much debate.  I have changed the section in the document
a few times and if you look at the tex file itself you'll notice that 
there is a section commented out that allows all of the distributions 
except BLOCK(N).  I don't know what answer is best, but the kernel can
survive pretty well whatever the decision of the group.
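
To make the Gaussian Elimination point concrete, here is a minimal sketch
(in Python rather than HPF, and not taken from any compiler) that counts
how many rows each processor still has to update after a given elimination
step.  It assumes the rows of the matrix are the distributed dimension, a
BLOCK size of ceil(n/p), and plain CYCLIC dealing rows round-robin; the
function names are made up for the example.

```python
def block_owner(row, n, p):
    """Processor owning 0-based row under BLOCK (assumed size ceil(n/p))."""
    size = -(-n // p)
    return row // size


def cyclic_owner(row, n, p):
    """Processor owning 0-based row under CYCLIC (round-robin by row)."""
    return row % p


def active_rows(n, p, step, owner):
    """Rows below pivot row `step` that each processor must still update."""
    counts = [0] * p
    for row in range(step + 1, n):
        counts[owner(row, n, p)] += 1
    return counts


if __name__ == "__main__":
    print(active_rows(16, 4, 7, block_owner))
    print(active_rows(16, 4, 7, cyclic_owner))
```

With 16 rows on 4 processors, after pivot row 7 the BLOCK layout leaves
two processors with nothing to do, while CYCLIC keeps two rows on each.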


> >\subsection{The \type {ALIGN} Directive}
> >
> >In Kernel HPF the {\type {ALIGN}} directive is restricted to
> >alignments that are direct;  there may be no offsets or strides
> >within an  {\type {ALIGN}}.  Replication and collapsing of dimensions
> >are also disallowed.
> 
> I'd favor collapsing dimensions, unless you have an example of how it makes
> addressing complex.
> 
>         !HPF$ ALIGN A(I,J) WITH B(I)
> 
> 
> >Collapsing of dimensions is disallowed because it can be accomplished
> >with the {\type {DISTRIBUTE}} directive.
> 
> But maintaining this correspondence when the DISTRIBUTE changes may not be easy.


I thought for a while about the problem of collapsing dimensions and the
ALIGN directive.  It is syntactically harder to concoct a kernel and much
harder to understand what we have created if ALIGN has this kind of limited 
functionality.  By stripping ALIGN of all of its power the kernel becomes
much more straightforward with an admitted loss of a feature that is
useful in many circumstances.  I believe that the simplicity of the 
result outweighs the loss of the functionality.


> 
> >\subsection{The {\type {TEMPLATE}} Directive}
> 
> If ALIGN is so restricted, it's sort of hard to imagine a need for
> TEMPLATE.  Just pick one of the arrays you were going to match against the
> template.

Good point and I agree.  TEMPLATE in the kernel has only one use: it is
a convenient placeholder for a distribution.  You can put the templates
into an include file and they are a convenient way to play with all of 
the distributions in a program.  Sort of like a distribution typedef.
I think this is part of the beauty of the simplification achieved above.

> 
> >\subsection{The {\type {FORALL}} Statement and Construct}
> >
> >These are included in Kernel HPF.
> >
> 
> I'm surprised (although not angry :-) that you don't want to restrict the
> expensive forms of FORALL.  For example, I'd be willing to disallow calling
> user functions from FORALL.  (Presumably PURE intrinsics are more-or-less
> straightforward.)

This is an area that can easily be mucked with in the kernel while it still
remains basically intact.  If you have a coherent set of changes you'd
like to propose, the subgroup can package them up in a way that we can
vote on them.  I am not averse to restricting the behavior of FORALL.  The
only argument I see against it is that it requires more for the user to
understand.

> 
> >\subsection{{\type {PURE}} Functions and Subroutines}
> >
> >These are included in Kernel HPF.
> >
> >{\em {Rationale}}: {\type {PURE}} functions can do no harm, and allow
> >an increased level of expressiveness.  They simplify the job for
> >the compiler and are part of Fortran 95.
> 
> 
> The key problem is that PURE functions can access global distributed data.
> As previous HPFF committee debates have shown, this is very hard to do on
> distributed memory machines.  If the purpose of Kernel HPF is "no
> performance surprises", this certainly seems to argue for restricting PURE.
> 
> Suggestion: Add constraints to PURE in Kernel HPF so that a PURE function
> cannot access global distributed data.  (Accessing global replicated data
> would seem safe, except Kernel HPF can't replicate data.)
> 

Again, as above, I am not necessarily against this, but it does make the
kernel less easy to immediately grasp.


> > Non-unit
> >strides may be specified when passing array sections as actual
> >arguments.
> 
> Why is this allowed, when strides in ALIGN are not?  After all, it causes
> the same addressing problems.

With the new rules in HPF, an implementation can use copy-in copy-out
semantics if desired.
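
As a rough illustration of what copy-in copy-out means for a strided
section, here is a sketch in C.  The callee scale_by_two and the wrapper
call_with_section are hypothetical names invented for this example; the
point is only that the callee sees a contiguous temporary and never has to
deal with the stride.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callee that expects a contiguous vector. */
static void scale_by_two(double *v, int n)
{
    for (int i = 0; i < n; i++)
        v[i] *= 2.0;
}

/* Pass the section a(lo:hi:stride) (0-based, inclusive bounds) by
   copy-in/copy-out.  Returns 0 on success, -1 if allocation fails. */
int call_with_section(double *a, int lo, int hi, int stride)
{
    assert(a != NULL && stride > 0 && lo >= 0 && hi >= lo);
    int n = (hi - lo) / stride + 1;
    double *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    for (int k = 0; k < n; k++)          /* copy-in */
        tmp[k] = a[lo + k * stride];
    scale_by_two(tmp, n);
    for (int k = 0; k < n; k++)          /* copy-out */
        a[lo + k * stride] = tmp[k];
    free(tmp);
    return 0;
}
```

The cost is two extra passes over the section, but it is paid at the call
site rather than in every address computation inside the callee.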

>                                                 Chuck
> 
> 
> 


---------------------------------------------------------------------------
To (un)subscribe to this list, send mail to hpff-external-request@cs.rice.edu.
Leave the subject line blank, and in the body put the line
(un)subscribe <email-address>
---------------------------------------------------------------------------

From owner-hpff-external  Mon Sep 11 15:04:14 1995
Received: by cs.rice.edu (PAA02228); Mon, 11 Sep 1995 15:04:14 -0500
Received: from mailhost.lanl.gov by cs.rice.edu (PAA02219); Mon, 11 Sep 1995 15:04:06 -0500
Received: from wrangler.lanl.gov by mailhost.lanl.gov (8.6.12/1.2)
	id OAA21440; Mon, 11 Sep 1995 14:03:50 -0600
Received: from quantum.lanl.gov (quantum.lanl.gov [128.165.113.199]) by wrangler.lanl.gov (8.6.12/8.6.12) with SMTP id OAA13079 for <hpff-external@cs.rice.edu>; Mon, 11 Sep 1995 14:03:50 -0600
Date: Mon, 11 Sep 1995 14:03:50 -0600
From: Jeffery S Brown <jxyb@lanl.gov>
Message-Id: <199509112003.OAA13079@wrangler.lanl.gov>
To: hpff-external@cs.rice.edu
Subject: hpff-external: hpff meeting
Sender: owner-hpff-external
Precedence: bulk


I plan to attend the HPFF meeting next week as the LANL rep.

I am interested in the external interfaces work.  We are currently
working with the Portland Group and NASA Ames to extend the P2D2
debugger to provide source level HPF debugging support.  I'm
interested in doing this in a "standard" way.

Are you looking at debugger interfaces?

How can I get "plugged in" to your efforts?

We are also doing a general evaluation of various vendor HPF
implementations and comparing them with other methods
on the T3D such as CRAFT and F90/shmem using a "real code"
as the evaluation vehicle (a 3d hydrodynamics code written
originally for the CM5).

thanks,

	Jeff Brown, Los Alamos National Laboratory

From owner-hpff-external  Tue Sep 12 09:16:45 1995
Received: by cs.rice.edu (JAA03663); Tue, 12 Sep 1995 09:16:45 -0500
Received: from [128.42.5.152] by cs.rice.edu (JAA03656); Tue, 12 Sep 1995 09:16:35 -0500
Date: Tue, 12 Sep 1995 09:16:35 -0500
X-Sender: chk@titan.cs.rice.edu
Message-Id: <v01530501ac7afca8eef2@[128.42.5.152]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
To: Doug MacDonald <macdon@think.com>
From: chk@cs.rice.edu (Chuck Koelbel)
Subject: hpff-external: Re: hpff: July HPFF Meeting Minutes
Cc: zosel@phoenix.ocf.llnl.gov, hpff-external@cs.rice.edu
Sender: owner-hpff-external
Precedence: bulk


At 18:16 09/11/95, Doug MacDonald wrote:
>...
>Hello,
>
>The portion of the July meeting minutes touches on a topic of interest to
>me: the object and data mapping descriptors used to provide interfaces
>between HPF routines and subroutines written in a "scalar" language such as
>C.  For instance, given an array A with a given alignment or distribution,
>an HPF routine might pass A to a C subroutine in the form of the address
>of a predefined descriptor object that contains information about the
>shape, type, layout, etc., of A.  Is standardization of such descriptor
>sets a part of subgroup E's activities?  If not, is such an effort underway
>elsewhere?
>
>Thanks,
>Doug MacDonald
>Thinking Machines

I'm redirecting this question to the hpff-external mailing list, which
handles Group E's discussions.

You can probably get a more definitive answer from David Loveman, who
chairs the subgroup.  My take is that this is within the scope of Group E,
but has not been pursued yet.  In part, this may be because there has been
no concrete proposal.  Another reason for the delay may be that the format
and processing of array descriptors is likely to be on the critical
performance path for HPF implementations.  This will tend to make consensus
more difficult, unless someone proves beyond a reasonable doubt that some
particular format is the most efficient possible.  (What I'm saying is,
vendors don't want to share the heart of their runtime system
implementation.  If some academic project blows the commercial compilers
out of the water, there will be more leverage.)

Just my two cents...

                                                Chuck Koelbel



From owner-hpff-external  Tue Sep 12 09:59:42 1995
Received: by cs.rice.edu (JAA05350); Tue, 12 Sep 1995 09:59:42 -0500
Received: from mail.think.com by cs.rice.edu (JAA05336); Tue, 12 Sep 1995 09:59:35 -0500
Received: from Delphi.Think.COM by mail.think.com; Tue, 12 Sep 95 10:59:20 -0400
Received: by delphi.think.com (4.1/Think-1.2)
	id AA15910; Tue, 12 Sep 95 10:59:19 EDT
Date: Tue, 12 Sep 95 10:59:19 EDT
Message-Id: <9509121459.AA15910@delphi.think.com>
From: Doug MacDonald <macdon@think.com>
To: chk@cs.rice.edu
Cc: zosel@phoenix.ocf.llnl.gov, hpff-external@cs.rice.edu
In-Reply-To: Chuck Koelbel's message of Tue, 12 Sep 1995 09:16:35 -0500 <v01530501ac7afca8eef2@[128.42.5.152]>
Subject: hpff-external: hpff: July HPFF Meeting Minutes
Sender: owner-hpff-external
Precedence: bulk


   Date: Tue, 12 Sep 1995 09:16:35 -0500
   From: chk@cs.rice.edu (Chuck Koelbel)

   At 18:16 09/11/95, Doug MacDonald wrote:
   >...
   >Hello,
   >
   >The portion of the July meeting minutes touches on a topic of interest to
   >me: the object and data mapping descriptors used to provide interfaces
   >between HPF routines and subroutines written in a "scalar" language such as
   >C.  For instance, given an array A with a given alignment or distribution,
   >an HPF routine might pass A to a C subroutine in the form of the address
   >of a predefined descriptor object that contains information about the
   >shape, type, layout, etc., of A.  Is standardization of such descriptor
   >sets a part of subgroup E's activities?  If not, is such an effort underway
   >elsewhere?
   >
   >Thanks,
   >Doug MacDonald
   >Thinking Machines

   I'm redirecting this question to the hpff-external mailing list, which
   handles Group E's discussions.

OK, sorry.  I'll put any further discussion there.

   You can probably get a more definitive answer from David Loveman, who
   chairs the subgroup.  My take is that this is within the scope of Group E,
   but has not been pursued yet.  In part, this may be because there has been
   no concrete proposal.  Another reason for the delay may be that the format
   and processing of array descriptors is likely to be on the critical
   performance path for HPF implementations.  This will tend to make consensus
   more difficult, unless someone proves beyond a reasonable doubt that some
   particular format is the most efficient possible.  (What I'm saying is,
   vendors don't want to share the heart of their runtime system
   implementation.  If some academic project blows the commercial compilers
   out of the water, there will be more leverage.)

   Just my two cents...

						   Chuck Koelbel

Thanks for giving me your take on this, Chuck.  I knew Carol Munroe of TMC
is on the standards committee, but not (until a few minutes ago) that she
is also on a group looking at these very issues.  I'll discuss it with her.

Doug

From owner-hpff-external  Wed Sep 13 03:45:54 1995
Received: by cs.rice.edu (DAA10534); Wed, 13 Sep 1995 03:45:54 -0500
Received: from epcc.ed.ac.uk by cs.rice.edu (DAA10523); Wed, 13 Sep 1995 03:45:43 -0500
Date: Wed, 13 Sep 95 09:45:06 BST
Message-Id: <837.9509130845@subnode.epcc.ed.ac.uk>
To: hpff-external@cs.rice.edu
Subject: hpff-external: Addressing mixed case external routine names
Cc: hjr@think.com
From: Harvey Richardson <hjr@think.com>
Organisation: Thinking Machines Corporation
Sender: owner-hpff-external
Precedence: bulk


I would like to propose an extension to the syntax of the extrinsic
interface definition to address the significant case issue.

An example being

   interface
     extrinsic(ANSI_C) subroutine update(widget) name("XmUpdateDisplay")
     ...

The NAME clause defines the (case significant) external routine name.
This routine is accessed from Fortran by the subroutine name - "update"
in this example.  The quotes are there so that we don't have to 
extend case significance in Fortran.

Harvey Richardson
Customer Support Group
Thinking Machines Corporation



From owner-hpff-external  Wed Sep 13 10:58:09 1995
Received: by cs.rice.edu (KAA21011); Wed, 13 Sep 1995 10:58:09 -0500
Received: from mail.think.com by cs.rice.edu (KAA20991); Wed, 13 Sep 1995 10:58:01 -0500
Received: from Delphi.Think.COM by mail.think.com; Wed, 13 Sep 95 11:57:49 -0400
Received: by delphi.think.com (4.1/Think-1.2)
	id AA22176; Wed, 13 Sep 95 11:57:48 EDT
Date: Wed, 13 Sep 95 11:57:48 EDT
Message-Id: <9509131557.AA22176@delphi.think.com>
From: Doug MacDonald <macdon@think.com>
To: baden@cs.ucsd.edu
Cc: schreibr@frey.riacs.edu, zosel@llnl.gov, hpff-external@cs.rice.edu,
        baden@cs.ucsd.edu, presberg@tc.cornell.edu
In-Reply-To: Scott B. Baden's message of Tue, 12 Sep 1995 17:06:51 -0700 (PDT) <9509130006.AA20425@gili>
Subject: hpff-external: Descriptors
Sender: owner-hpff-external
Precedence: bulk


   From: baden@cs.ucsd.edu (Scott B. Baden)
   Date: Tue, 12 Sep 1995 17:06:51 -0700 (PDT)

   Rob,

   As a follow on to this: what about the converse, that is
   calling HPF from another language (e.g. C++)? In this case
   HPF is viewed as the extrinsic language.

   This has gotten somewhat bogged down (apparently in part
   due to leadership changes in the C++ committee),
   but it would be nice to discuss this next week.

   I've had some preliminary discussions with Chuck.
   It appears that Annex A is a good source of ideas for
   deriving  the facilities that would be needed on the C++ end.
   (This isn't surprising since we are in effect providing
   facilities that mirror the HPF-to-external interface)

   To get the ball rolling.  Let's say we have an SPMD program,
   that is a group of processors (e.g. an MPI communicator)
   all executing the same program in loosely synchronous fashion.

This is very much along the lines of what I have in mind.  However, I think
that Chuck K. had a good point in an earlier communication when he said
that it would be difficult (and possibly a bad idea as well) to try to get
implementors to standardize on the _implementation_ of a descriptor set.

For one thing, depending on what language features belong to the subset of
HPF implemented by the compiler used to compile the HPF routine called by
the C routine, the descriptor set may be more or less complex.  For
example, if CYCLIC and CYCLIC(n) distributions aren't supported, or if
non-unit-strided and offset alignments aren't supported, the descriptor set
might be much simpler.  Standard creation/inquiry functions probably are
more realistic.
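
As a sketch of what I mean by standard creation/inquiry functions, here is
a small C fragment.  Every name in it (mapped_array_desc, desc_create,
desc_rank, desc_extent, desc_destroy) is hypothetical, and the struct
layout is only one possibility; the idea is that callers use the functions
and never depend on the layout.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical opaque handle: callers never see the vendor's layout. */
typedef struct mapped_array_desc mapped_array_desc;

/* One possible private layout.  A vendor supporting only BLOCK could
   use something much smaller without changing the functions below. */
struct mapped_array_desc {
    int rank;
    int extent[7];      /* Fortran 90 allows at most 7 dimensions */
};

/* Returns NULL if allocation fails. */
mapped_array_desc *desc_create(int rank, const int *extent)
{
    assert(rank >= 1 && rank <= 7 && extent != NULL);
    mapped_array_desc *d = malloc(sizeof *d);
    if (d == NULL)
        return NULL;
    d->rank = rank;
    for (int i = 0; i < rank; i++)
        d->extent[i] = extent[i];
    return d;
}

int desc_rank(const mapped_array_desc *d)
{
    assert(d != NULL);
    return d->rank;
}

/* dim is 1-based, following the Fortran convention. */
int desc_extent(const mapped_array_desc *d, int dim)
{
    assert(d != NULL && dim >= 1 && dim <= d->rank);
    return d->extent[dim - 1];
}

void desc_destroy(mapped_array_desc *d)
{
    free(d);
}
```

In a real interface the struct definition would live in the vendor's
private source, and inquiry functions for distribution and alignment would
be added in the same style.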

Doug

   What we need is an entry to initialize the HPF run time system
   i.e. to identify global processor arrangements.  We also need
   a routine much like your create_mapped_array_descriptor(), with arguments
   that specify the global shape, the processor array, distribution
   and alignment and so on.

   Our SPMD program collectively allocates mapped arrays,
   collectively calls HPF routines.  There would have to be
   some agreements worked out,  i.e. that on entry to an HPF routine
   from C++ a scalar must have the same value on all processors,
   (this mirrors the language in Annex A for calling C from HPF)
   and that mapped arrays called from C++ would
   have to promise to match the declarations on the HPF end and so on.

   We can also support task parallelism, through the use of
   multiple MPI communicators.  Then we could think of multiple
   SPMD programs calling different HPF routines.  

   Though the facility should work in principle with any language,
   at this point I would tend to be biased to C++.

   Comments?

   Scott

   >  
   >  ---------------------------------------------------------------------------
   >  hpff@cs.rice.edu is a mailing list for announcements related to High
   >  Performance Fortran.  Instructions for adding or deleting yourself
   >  from this list appear at the bottom of this message.
   >  ---------------------------------------------------------------------------
   >  
   >  I think that a standard descriptor format would be
   >  a good idea, but it should be in the form of a fortran 90
   >  derived type, that can be replicated, for example, and passed as
   >  an ordinary parameter to an extrinsic routine; then a library
   >  routine would be used to set it up:
   >  
   >  	interface
   >  	   extrinsic (c_local) function c_local_fn(desc)
   >  	      integer c_local_fn
   >  	      type (mapped_array_descriptor) desc
   >  	   end function c_local_fn
   >  	end interface
   >  
   >  	type (mapped_array_descriptor) x_desc
   >  	real x(100,200)
   >  !hpf$ distribute x(block, *)
   >  
   >  !
   >  !   call an hpf library routine to create the exportable 
   >  !   array descriptor for x
   >  !
   >  	x_desc = create_mapped_array_descriptor(x)
   >  	call c_local_fn(x_desc)
   >  
   >  ---------------------------------------------------------------------------
   >  To (un)subscribe to this list, send mail to hpff-request@cs.rice.edu.  Leave
   >  the subject line blank, and in the body put the line
   >  (un)subscribe <email-address>
   >  ---------------------------------------------------------------------------
   >  



From owner-hpff-external  Fri Sep 15 15:40:58 1995
Received: by cs.rice.edu (PAA15832); Fri, 15 Sep 1995 15:40:58 -0500
Received: from timbuk.cray.com by cs.rice.edu (PAA15817); Fri, 15 Sep 1995 15:40:49 -0500
Received: from ironwood.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.6.12/CRI-gate-8-2.5) with ESMTP id PAA20524 for <hpff-external@cs.rice.edu>; Fri, 15 Sep 1995 15:40:49 -0500
Received: from hickory304 (meltzer@hickory304 [128.162.145.4]) by ironwood.cray.com (8.6.12/CRI-ccm_serv-8-2.8) with SMTP id PAA03965 for <hpff-external@cs.rice.edu>; Fri, 15 Sep 1995 15:40:47 -0500
From: Andy Meltzer <meltzer@cray.com>
Received: by hickory304 (5.x/btd-b3)
          id AA14365; Fri, 15 Sep 1995 15:40:45 -0500
Message-Id: <9509152040.AA14365@hickory304>
Subject: hpff-external: Intercallability proposal
To: hpff-external@cs.rice.edu
Date: Fri, 15 Sep 1995 15:40:44 -0500 (CDT)
X-Mailer: ELM [version 2.4 PL24-CRI-b]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-external
Precedence: bulk



Following is a latex version of the intercallability document
that I have promised.  Sorry for the delay.


						Andy Meltzer
						meltzer@cray.com

######################
\documentstyle[twoside,11pt]{article}

\pagestyle{myheadings}
\sloppy
\footheight=-.60in
\headheight=0in
\textwidth=5.50in
\topmargin=-.20in
\oddsidemargin=0.5in
\evensidemargin=0.5in

\itemsep=.05in
\topsep=.05in
\parsep=.05in
\parskip=.10in
\textheight=9in

\newcounter{dofigures}  \setcounter{dofigures}{0}
\newcounter{dotables}        \setcounter{dotables}{0}
\newcommand{\havefigures}{\setcounter{dofigures}{1}}
\newcommand{\havetables} {\setcounter{dotables}{1}}

\newcommand{\cccdocid}[5]{
  \bibliographystyle{alpha}
% ******************** Title Page ********************
  \vspace*{.5in}
  \centerline{\Large \bf {#1}}
  \vspace*{.02in}
  \centerline{{#2}}
  \vspace*{.02in}
  \centerline{{#3}}
  \vspace*{.02in}
  \centerline{{#4}}
  \vspace*{.02in}
  \centerline{{#5}}
  \vspace*{.5in}

  % ******************** List of Figures ********************
  % Note: Remove the following line if there are no figures
%  \ifodd\value{dofigures} \listoffigures \newpage \fi
%  \ifodd\value{dotables}  \listoftables  \newpage \fi
%  \markboth{#1} {#3}
%  \pagenumbering{arabic}
  }

% ******************** Special macros ********************
%\catcode`^^Z=9                % Make TeX ignore IBMPC EOF character

\newcommand{\ital}[1]{{\it #1}}
\newcommand{\type}[1]{{\tt #1}}
\newcommand{\bold}[1]{{\bf #1}}
\newcommand{\dfn}[1]{{\it #1}\index{#1}}
\newcommand{\comment}[1]{}

\def\fineprint{\relax}
\def\stopfine{\relax}


\newcommand{\beginprog}{\begin{tabbing}
xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=xxxx\=\kill
}
\newcommand{\tb}{\>\tt}
\newcommand{\finprog}{\end{tabbing}}
\newcommand{\progtxt}[1]{``{\tt #1}''}
\newcounter{save_enum}
\newcommand{\hilite}[1]{ \item {\bf #1\\} }
\newcommand{\morelater}{\centerline{$\ldots$\ital{Fill this in later}$\ldots$}}


% ******************** Start of Document ********************
\begin{document}

\cccdocid
  {HPF calling C Interoperability Proposal}
  {Ver 1.1}
  {Andrew Meltzer}
  {Cray Research, Inc.}
  {September, 1995}


\section{Introduction}

Defining an interoperability mechanism is as much a problem 
of recognizing the limitations of interlanguage communication as it is of 
solving the technical interlanguage calling issues. 

This document contains a proposal for an HPF calling C interoperability
mechanism and a rationale for the limitations and techniques used in the
proposal.   It also contains a set of representative calls from the
X Windows interface for context.


\subsection{Defining the Bounds of the Problem}

There are a number of ways in which the interoperability question has
been viewed:

\begin{enumerate}
\item\label{i1} HPF should be able to call any C routine and be able to
           represent or convert any C type. 

\item\label{i2} HPF should be able to deal with any C type that has a
           general approximation in Fortran.

\begin{enumerate}
\item\label{i2a}   The compiling system should handle all type
                   conversions.  This includes remapping structures
                   and transposing arrays.

\item\label{i2b}   The compiling system should automatically convert
                   all ``basic'' types to their closest equivalent in the
                   other language.
\end{enumerate}

\item\label{i3} HPF programs calling C do not need any special
           mechanisms.  It is the problem of the application programmer
           to make sure that values are passed properly.
\end{enumerate}


Option \ref{i3} is currently the standard method, but since most application
programmers are looking for more ease-of-use, this option is not sufficient.

Option \ref{i1} is an enormous effort and might impose a large runtime
penalty.  To achieve this goal, Fortran would need to be extended to have
some new types (bits, unsigned values, and perhaps unions) and the
compiling system would need to transpose arrays on calls and 
convert complicated structures to their C layouts.  

Option \ref{i2a} is unreasonable for some of the reasons that option 
\ref{i1} is unreasonable.  This sort of type conversion/mapping is 
often not
desired and may impose a huge and unavoidable runtime penalty on the
user of the interface.  Consequently I believe that option \ref{i2b} is as
much as a compiling system should offer.  Users would be required to
understand layout of arrays in memory for each language and the way
structures are constructed.

The solution proposed below is a variety of option \ref{i2b} which 
incorporates some of the features of \ref{i2a}, but only at the application 
programmer's discretion.


\section{The X Window Interface}

The ability to call the X Windows library has often been mentioned as a
good test of the sufficiency of an interlanguage interface for HPF and C.  
I have picked a few representative library routines to demonstrate
the problems that must be dealt with.

\begin{verbatim}
        Display *XOpenDisplay(char *display_name)
\end{verbatim}

Clearly any interface will have to deal with a conversion from Fortran
character strings to the {\tt char *} of C.  Somewhat more complicated might
be a call like

\begin{verbatim}
        XGetIconName(Display *display, Window w, char **icon_name)
\end{verbatim}

For this call, as in most, the display value is simply a pointer
and need not be inspected by the HPF
compiler.  {\tt Window} is an ID (some integer type), and {\tt icon\_name} is the
result value and is consequently a pointer to a {\tt char *}.   A problem
which must be addressed is that of pointers to pointers in C (though in
this case one might reasonably consider {\tt char *} to be the 
character string type of C so that a pointer to it 
is the equivalent of a pointer to a
Fortran character string.)  In many other calls, pointers to structure
pointers are the return value, for example {\tt size\_list} in the following:

\begin{verbatim}
        XGetIconSizes(Display *display, Window w,
                      XIconSize **size_list, int *count)
\end{verbatim}


Finally a call with some simpler parts:

\begin{verbatim}
        XStoreNamedColor(Display *display, Colormap cmap, char *colorname,
                         unsigned long pixel, int flags)
\end{verbatim}

{\tt Display} has been previously discussed. {\tt Colormap} 
can be stored in an 
{\tt int} type of the proper size. The {\tt char *} type has 
been discussed.  {\tt Unsigned long} should have 
some equivalent representation for Fortran 90 on any 
given compilation system, and finally {\tt int} is some size C integer.


\section{Interlanguage Calling Proposal}

This section describes the syntax and semantics for the interlanguage
calling sequence.  The interlanguage calling sequence is 
implemented with the extrinsic mechanism.
A new extrinsic type
\begin{verbatim}
        EXTRINSIC (C) 
\end{verbatim}
is added to the language.  The extrinsic declaration interface block must
be visible to the compilation system at the time the call to the C function
is compiled.

It is important to fully understand what the {\tt EXTRINSIC}
mechanism is.  The {\tt EXTRINSIC} mechanism ``allows an HPF programmer
to declare a calling interface to a non-HPF subprogram'' (HPF Version 1.1,
Annex A.)  In this case the {\tt EXTRINSIC} mechanism assists the programmer
when calling a C function or subroutine by redefining the semantics
of a function or subroutine call.  Within the {\tt EXTRINSIC} the
code need not conform to HPF semantics, though it is important from an
implementation standpoint that it be easily parsed by an HPF parser.  
This proposal extends the semantics of the {\tt EXTRINSIC} mechanism
only a small amount.

This proposal also extends the HPF language with a new intrinsic function
(although it is already in common usage):
\begin{verbatim}
        LOC
\end{verbatim}

{\tt LOC} is used at the call site to indicate that the address of the
actual argument is required.

Within the {\tt EXTRINSIC (C)} environment a new attribute is defined:

 {\tt MAP\_TO([C\_TYPE= ]} {\it c type specifier } {\tt [, [LAYOUT= ]}{\it layout specifier}{\tt ])}

This attribute may have one or two specifiers: {\tt C\_TYPE}, a 
required specifier,
and {\tt LAYOUT}, an optional specifier.  {\tt C\_TYPE} specifies the 
C type that the HPF type will be mapped to, and {\tt LAYOUT} 
indicates the memory layout of the data being passed.  If no {\tt LAYOUT}
argument is given it defaults to {\tt NO\_CHANGE}.  The following are
the layout specifiers an implementation is required to provide:

\begin{verbatim}
        Name               Function
        ----               --------
        C_ARRAY            Convert the array to a C memory layout
        HPF_ARRAY          Convert the array to an HPF memory layout
        NO_CHANGE          Do no rearrangement of the array memory layout
\end{verbatim}
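
As a rough sketch of the rearrangement that {\tt C\_ARRAY} implies for a
rank-2 array, assuming it means converting Fortran column-major storage
to C row-major storage, the runtime would do something like the following.
The function name {\tt to\_c\_layout} is hypothetical and not part of this
proposal.

\begin{verbatim}
```c
#include <assert.h>
#include <stddef.h>

/* Copy a rows x cols array from Fortran (column-major) order in src
   to C (row-major) order in dst.  src and dst must not overlap. */
void to_c_layout(const double *src, double *dst,
                 size_t rows, size_t cols)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            dst[i * cols + j] = src[j * rows + i];
}
```
\end{verbatim}

This copy touches every element, which is why the rearrangement is left
to the application programmer's discretion rather than done on every call.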


The following are the type specifiers that an implementation is required
to provide.

\begin{verbatim}
        Name                            C type
        ----                            ------
        NO_CHANGE                       no remapping is done
        INT                             int
        LONG                            long
        SHORT                           short
        CHAR                            char
        FLOAT                           float
        DOUBLE                          double
        LONG_DOUBLE                     long double
        CHAR_PTR                        indicates that the string will be 
                                        converted to a null terminated char *

\end{verbatim}

The vendor may also add other type specifiers.
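
As an illustration of the {\tt CHAR\_PTR} conversion, the following C
sketch turns a fixed-length, blank-padded Fortran character value into a
null terminated {\tt char *}.  The function name
{\tt fortran\_to\_c\_string} is hypothetical, and dropping trailing blanks
is an assumption; the proposal does not say how padding is treated.

\begin{verbatim}
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Convert a blank-padded Fortran CHARACTER(len) buffer to a newly
   allocated, null-terminated C string.  Trailing blanks are dropped
   (an assumption).  Returns NULL if allocation fails; the caller
   frees the result. */
char *fortran_to_c_string(const char *fstr, size_t len)
{
    assert(fstr != NULL || len == 0);
    while (len > 0 && fstr[len - 1] == ' ')
        len--;
    char *cstr = malloc(len + 1);
    if (cstr == NULL)
        return NULL;
    if (len > 0)
        memcpy(cstr, fstr, len);
    cstr[len] = '\0';
    return cstr;
}
```
\end{verbatim}

The copy is needed because a Fortran character value carries its length
separately and has no terminator of its own.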

If a C function using the {\tt EXTRINSIC (C)} interface
is called from within an HPF program then the interface
uses the same rules as {\tt HPF\_SERIAL} to remap data.  It may also be called
from within an {\tt HPF\_LOCAL} environment, in which case a different
instance of the interface is
invoked (a different instance of the C function is called) on each processor.

HPF is not case sensitive; C is.  
This proposal accepts the suggested addition of a {\tt NAME} clause
to the {\tt EXTRINSIC} statement.  If this is added, I suggest
that it is an ideal way to specify the case of the C function name.

 {\tt NAME("}{\it function name}{\tt ")}

For example:

\begin{verbatim}
       EXTRINSIC (C) FUNCTION CASE() NAME("CaSe")
\end{verbatim}


\subsection{Explanation of the proposal}

The declaration {\tt EXTRINSIC (C)} indicates to the compilation
system how to convert types (in C parlance use a cast to convert types) 
to the types that the called C program requires.  It does this with
the {\tt MAP\_TO} attribute.  The implementation must also pass
these arguments by value, or as is appropriate for the vendor's implementation
of C.  

The vendor must provide the set of required type specifiers
and may add as many additional type specifiers as it wishes.
The vendor may also optionally provide the extrinsic interface for 
the C system or library routines as well.

To handle pointers and pointers to pointers this proposal takes a
mechanism from common practice, the {\tt LOC} intrinsic.  If, for
example, a C function required an argument to be passed as a 
pointer to a pointer:

\begin{verbatim}
       void cfunc(int **pp);
\end{verbatim}

then the corresponding call (with, of course, the proper extrinsic
declaration visible) would be:

\begin{verbatim}
       INTEGER PP
       CALL CFUNC(LOC(LOC(PP)))
\end{verbatim}

Since the called C function requires a pointer to a 
pointer to be passed, using the {\tt LOC} mechanism here does not
hinder portability.
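
As a rough illustration, the effect of {\tt LOC(LOC(PP))} can be mimicked
by hand in C.  The helper {\tt call\_cfunc\_with\_loc\_loc} and the body
given to {\tt cfunc} are assumptions made for this sketch; the proposal
itself fixes only the prototype.

\begin{verbatim}
#include <assert.h>
#include <stddef.h>

/* Example callee (body assumed): stores a value through a pointer
   to a pointer. */
void cfunc(int **pp)
{
    assert(pp != NULL && *pp != NULL);
    **pp = 42;
}

/* Roughly what a compiler might generate for
   CALL CFUNC(LOC(LOC(PP))): the inner LOC yields the address of PP,
   which is held in a temporary; the outer LOC yields the address of
   that temporary. */
void call_cfunc_with_loc_loc(int *pp_storage)
{
    int *tmp = pp_storage;   /* temporary holding LOC(PP) */
    cfunc(&tmp);             /* pass LOC(LOC(PP))         */
}
\end{verbatim}

After the call, the integer named by {\tt PP} holds whatever the C routine
stored through the double indirection.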

In addition, HPF must recognize a new attribute within {\tt EXTRINSIC (C)},
the {\tt MAP\_TO} attribute.  This attribute is the core of the mechanism.
It allows the user to specify the conversions necessary to facilitate
the interlanguage calls.   The attribute has one or two specifiers.  The first
specifier indicates the type of the C data that the HPF data must be
converted to.  The second specifier indicates (for an array) whether
and how to remap the storage.  {\tt HPF\_ARRAY} indicates that the 
runtime system should re-structure an array in memory (this is only
useful for a return value from a called C routine) to conform to the 
declared layout of the array that will contain the result of the function.
{\tt C\_ARRAY} indicates that the
runtime system should re-structure an array in memory to conform to
the system's default C style layout (this is only useful for arguments.)
For example, a user would
write the following within the {\tt EXTRINSIC (C)} interface to ensure 
that the compiler remaps the array {\tt a}
to the system's default C array layout:

\begin{verbatim}
         INTEGER, MAP_TO(C_TYPE=INT, LAYOUT=C_ARRAY)  a(100, 10, 4)
\end{verbatim}

The {\tt LAYOUT} argument is
not strictly required to make the interface work, but it is used
often enough that there should be a portable way of specifying
the transformations.  
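
The rearrangement that {\tt C\_ARRAY} asks for might look like the
following C sketch for a rank-3 integer array.  The function name
{\tt to\_c\_layout} is hypothetical, and the sketch assumes that the
Fortran side stores the array contiguously in column-major order and that
the C side expects contiguous row-major storage.

\begin{verbatim}
#include <assert.h>
#include <stddef.h>

/* Copy a Fortran (column-major) array of shape (n1, n2, n3) into a
   C (row-major) array of shape [n1][n2][n3], so that Fortran element
   a(i+1, j+1, k+1) becomes C element dst[i][j][k]. */
void to_c_layout(const int *src, int *dst,
                 size_t n1, size_t n2, size_t n3)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t i = 0; i < n1; i++)
        for (size_t j = 0; j < n2; j++)
            for (size_t k = 0; k < n3; k++)
                dst[(i * n2 + j) * n3 + k] = src[i + n1 * (j + n2 * k)];
}
\end{verbatim}

For the array {\tt a(100, 10, 4)} above the call would be
{\tt to\_c\_layout(src, dst, 100, 10, 4)}.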

It is important to emphasize that the extrinsic environment does more
than just type checking.  In the presence
of {\tt EXTRINSIC (C)} and the {\tt MAP\_TO} attribute, type conversion 
is done.   The implementation 
might warn a user about a real to integer type conversion, but in 
general the interface will use the types specified by the {\tt MAP\_TO}
attribute in the interface as target 
types for the call and do whatever it takes to make the types passed 
in fit.   This only applies to the system's basic types, however.
If the {\tt MAP\_TO} attribute is not used then the storage is passed
through the interface as is, though it is passed by value rather than
reference if the implementation defines a C interface in such a way.

The vendor must provide an {\tt EXTRINSIC (C)} to interface 
with the native associated C compiler, but may also feel compelled 
to provide an extrinsic {\tt (GNU\_C)} or {\tt (CRI\_C)} as well, if 
the C compiler provided by those
vendors is available.  

For arrays and user defined types nothing is done by default in 
the interface, though for arrays the user may specify some transformations using
the {\tt LAYOUT} specifier.  It
is the responsibility of the user to make sure that a structure will
look exactly like the implementation's C structure (unless a specifier
for the structure has been provided by the vendor.)  This 
may mean OR'ing bits together into integers, adding padding, switching 
the order of data in memory, etc.  It is also the user's responsibility to 
ensure that
the row/column major properties of all arrays are transposed properly, etc.,
for the call when {\tt LAYOUT} is not used.

Type checking between the declared arguments in the extrinsic interface 
block and the calling routine follows Fortran 90 rules:  types must match
exactly.  The conversions that occur are between the declarations in the
interface block and the types specified by the {\tt MAP\_TO} attribute.

\subsection {Types}

\begin{itemize}
\item      {\bf Unsigned:} \linebreak
           Fortran does not have unsigned types.  Attempting to call a C
           routine which requires all the bits of a given integer size to
           be set properly will require a Fortran programmer to deal with
           negative numbers.  This cannot be avoided because of
           representational limits in the Fortran language and is not
           really an interface issue.  Unsigned types can be dealt with
           as integers. 

\item      {\bf Bitfields:} \linebreak
           Same as unsigned, but Fortran 90 does have routines that can 
           manipulate bits in integer types.

\item      {\bf char*:} \linebreak
           There is a 99\% solution which is viable.  The HPF program
           can treat a null as the end of a string and the interface can
           therefore easily deal with the string conversion problem for
           scalars, adding
           a null when calling the C routine (the programmer would be
           required to ensure that there is space to add it) and removing
           the terminating null on return.  Character strings such as
	   ``a$\backslash$0b$\backslash$0c'' for example may not be 
           passed through this interface.  Anything other than scalar
           data (e.g. arrays and structures) is the user's problem.

\item      {\bf Enums:} \linebreak
           The solution to this is implementation dependent because
           of the varied ways in which a C implementation can deal with them.
           The vendor could probably define a set of 
           type specifiers that would cover all of the implementation's ranges
           if it did not map enums to int.

\item      {\bf Unions:} \linebreak
           These are not really a problem.  If the HPF side wants to
           access fields of some part of the union, it can just declare
           something the size of the largest part of the union putting
           in pads where necessary.   Or the {\tt SELECT CASE} construct could
           be used and the Fortran application might be able to simulate
           unions.  This falls under the category of user required mappings.

\item      {\bf Pointers:} \linebreak
           The {\tt LOC} intrinsic
           is used to solve the pointer problem.  This 
           allows the caller to pass the
           address of an argument rather than the argument itself.
           {\tt LOC(LOC())} is used to indicate a pointer to a pointer.
           In this case the compiling system automatically allocates 
           a temporary to hold the new storage required.  The implementations
           only need to deal with C style pointers.  The data that the 
	   pointer points to must be contiguous.  In Fortran terminology,
	   the target of the pointer must be scalar or if it points to 
	   an array the user must ensure that the array is contiguous.
\end{itemize}
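
For the {\bf char*} item above, the scalar string conversion could be
sketched in C as follows.  The function names are hypothetical.  The sketch
assumes that the Fortran character value is blank padded, that the caller
has reserved one extra byte past its declared length, and that trailing
blanks are dropped before the null is added; the proposal itself does not
say whether blanks are trimmed.

\begin{verbatim}
#include <assert.h>
#include <stddef.h>

/* Before the call: null-terminate a blank-padded Fortran character
   value in place.  buf must have room for len characters plus one
   extra byte.  Returns the length of the resulting C string. */
size_t fortran_to_c_string(char *buf, size_t len)
{
    assert(buf != NULL);
    size_t n = len;
    while (n > 0 && buf[n - 1] == ' ')   /* drop trailing blanks */
        n--;
    buf[n] = '\0';
    return n;
}

/* On return: remove the terminating null and restore blank padding
   up to the declared Fortran length. */
void c_to_fortran_string(char *buf, size_t len)
{
    assert(buf != NULL);
    size_t n = 0;
    while (n < len && buf[n] != '\0')
        n++;
    for (; n < len; n++)
        buf[n] = ' ';
}
\end{verbatim}

A value with an embedded null would be cut short by the second routine,
which is the limitation noted in the item.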
        
\section{Example}

Note that this example is for a 64-bit machine.

For this example the application programmer will be passing a user 
defined type as well
as a number of basic types. The type below is defined by the 
application programmer.  The 
application programmer has determined that
the data for the call to the C program should be laid out as this
Fortran implementation lays out {\tt MY\_TYPE} in memory.  The proper
layout of this user defined structure is the responsibility of the 
application programmer.

\begin{verbatim}
MODULE USER_DEFINED_THIS
  TYPE MY_TYPE
    INTEGER                int1
    REAL(KIND=KIND(1.0))   real1
  END TYPE MY_TYPE
END MODULE USER_DEFINED_THIS

\end{verbatim}

The interface below is the actual interface for the call to a function
named {\tt cfunc}.  

\begin{verbatim}
INTERFACE
   EXTRINSIC (C) FUNCTION CFUNC(w, x, y, z, a, p) NAME('cfunc')
      use HPF_C_INTERFACE
      use USER_DEFINED_THIS
         REAL, MAP_TO(float)                cfunc
         TYPE (MY_TYPE)                     w
         INTEGER, MAP_TO(short)             x
         REAL, MAP_TO(float)                y
         INTEGER, MAP_TO(char)              z
         INTEGER(KIND=4), MAP_TO(short)     a(100)
         INTEGER                            p
   END FUNCTION
END INTERFACE
\end{verbatim}


Finally, the call is shown below with some of the Fortran data declarations
for the types that are passed.   Recall that the interface block above must
be visible to this call.

\begin{verbatim}
The call:

   TYPE (MY_TYPE)         w
   INTEGER                z, x
   REAL                   y
   INTEGER (KIND=4)       a(100)
   INTEGER                p
    .
    .
    .

   r = cfunc(w, x, y, z, a, LOC(p))
\end{verbatim}

Below is the prototype for the C routine.  This is not visible to the
Fortran code and is shown here for completeness.

\begin{verbatim}

  float cfunc(struct my_type w, short x, float y, char z,
              short a[100], int *p);


\end{verbatim}

This call to {\tt cfunc} causes the implementation to pass the
arguments {\tt w}, {\tt x}, {\tt y}, and {\tt z} by value.  Further, 
{\tt x} is converted to a {\tt short} and {\tt z} is converted to 
a C {\tt char}.  {\tt a} is passed as a pointer (the way arrays are
passed in this implementation's C), but is left as is in memory
since arrays are not automatically converted.  {\tt p} is passed
as a pointer since the {\tt LOC} intrinsic is used at the call site.


\end{document}

---------------------------------------------------------------------------
To (un)subscribe to this list, send mail to hpff-external-request@cs.rice.edu.
Leave the subject line blank, and in the body put the line
(un)subscribe <email-address>
---------------------------------------------------------------------------

From owner-hpff-external  Sun Sep 17 21:33:27 1995
Received: by cs.rice.edu (VAA09248); Sun, 17 Sep 1995 21:33:27 -0500
Received: from odin.ucsd.edu by cs.rice.edu (VAA09242); Sun, 17 Sep 1995 21:33:19 -0500
Received: from gili.ucsd.edu by odin.ucsd.edu; id AA06512
	sendmail 5.67/UCSDPSEUDO.4-CS via SMTP
	Sun, 17 Sep 95 19:33:18 -0700 for chk@cs.rice.edu
Received: by gili (5.67/UCSDPSEUDO.4)
	id AA21553 for hpff-external@cs.rice.edu; Sun, 17 Sep 95 19:33:16 -0700
From: baden@cs.ucsd.edu (Scott B. Baden)
Message-Id: <9509180233.AA21553@gili>
Subject: hpff-external: Re: Descriptors
To: schreibr@frey.riacs.edu (Rob Schreiber)
Date: Sun, 17 Sep 1995 19:33:15 -0700 (PDT)
Cc: chk@cs.rice.edu, hpff-external@cs.rice.edu,
        baden@cs.ucsd.edu (Scott Baden)
In-Reply-To: <199509131823.LAA04111@frey.riacs.edu> from "Rob Schreiber" at Sep 13, 95 11:23:54 am
X-Mailer: ELM [version 2.4 PL23]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 8203      
Sender: owner-hpff-external
Precedence: bulk

Here is a very rough proposal for calling HPF from an extrinsic SPMD program.
I'll call this "SPMD to HPF" for short.

We are going to need to work with the MPI and C++ folks as well as
some of the vendors who are writing the compilers.


Proposal for calling HPF from an extrinsic SPMD program.
	Scott Baden, Chuck Koelbel, Rob Schreiber


In some applications it is useful to handle coordination and data
motion in an SPMD "harness" written in an extrinsic language, and to
invoke HPF from this harness to handle computation.  This facility mirrors the
"HPF-to-SPMD" facility described in Annex A of the HPF spec and
borrows many of the ideas described therein. We also get a limited
form of task parallelism if we admit MPMD (multiple program, multiple
data) parallelism.


We can think of an SPMD program in terms of an MPI communicator:
loosely synchronous execution of the same program on multiple
processors. (Similarly with MPMD: different programs running on
different processors.)

Since MPI allows one to configure a processor topology for a
communicator (using the cache mechanism) it makes sense to identify an
MPI communicator topology with an HPF PROCESSORS spec.  The advantage
of this approach is that the values returned by NUMBER_OF_PROCESSORS
and PROCESSORS_SHAPE are well-defined.  Note that it is possible to
dynamically change the members of a communicator or the processor
topology.  The consequences of this will require some thought.


Three sets of entries are needed to let the user interact with
HPF.  (This spec should be language independent in keeping with
tradition.) These entries should be called with "the same arguments at
roughly the same time," i.e. in loosely synchronous fashion, by all
members of a communicator.

    The entries

I. hpf_init(), which establishes various state needed by the
HPF run time system.

II.  create_mapped_array_descriptor(), which returns an opaque pointer
to a descriptor of a mapped array with specified shape,
alignment, distribution, processors spec, and so on.
(Rob also mentions an inverse of create_mapped_array_descriptor.
I'll get to this later on.)


III. various query routines that enable the user to access a mapped
array using an interface defined by HPF.  In effect these
are "mirror images" of routines defined in  Annex A of the HPF spec.
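
A minimal C sketch of what entry II might record, together with one
query routine from entry III.  Every name, field, and argument list below
is hypothetical, since the proposal does not define a signature, and the
sketch only records the mapping on one process; it does not allocate
the distributed array or do anything collective.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAD_MAX_RANK 7   /* Fortran 90 allows at most seven dimensions */

typedef enum { MAD_BLOCK, MAD_CYCLIC, MAD_COLLAPSED } mad_dist;

/* Callers would treat this as opaque; the fields are assumptions. */
typedef struct mapped_array_descriptor {
    int      rank;
    size_t   shape[MAD_MAX_RANK];        /* global extents             */
    mad_dist dist[MAD_MAX_RANK];         /* distribution of each axis  */
    int      procs_shape[MAD_MAX_RANK];  /* processors along each axis */
} mapped_array_descriptor;

/* Returns an opaque pointer, or NULL if allocation fails. */
mapped_array_descriptor *
create_mapped_array_descriptor(int rank, const size_t *shape,
                               const mad_dist *dist, const int *procs_shape)
{
    assert(rank >= 1 && rank <= MAD_MAX_RANK);
    assert(shape != NULL && dist != NULL && procs_shape != NULL);
    mapped_array_descriptor *d = calloc(1, sizeof *d);
    if (d == NULL)
        return NULL;
    d->rank = rank;
    memcpy(d->shape, shape, (size_t)rank * sizeof *shape);
    memcpy(d->dist, dist, (size_t)rank * sizeof *dist);
    memcpy(d->procs_shape, procs_shape, (size_t)rank * sizeof *procs_shape);
    return d;
}

/* Descriptor-based counterpart of GLOBAL_SIZE(): total element count. */
size_t global_size(const mapped_array_descriptor *d)
{
    assert(d != NULL);
    size_t n = 1;
    for (int i = 0; i < d->rank; i++)
        n *= d->shape[i];
    return n;
}
```

Each member of the communicator would make the same call with the same
arguments and later release the descriptor with free().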


Details.


I.  HPF_INIT().  I'll punt on this for now.


II.  create_mapped_array_descriptor()


This routine is the workhorse of the interface.  We borrow the idea
from local HPF that a local grid is viewed in terms of a set of blocks,
and that there is a standard interface comprising routines that enable
the SPMD programmer to efficiently access and iterate over the data.
This is nicely defined in Annex A, which I adapt for the task at hand:


"All HPF arrays accessible from an SPMD procedure are logically carved
up into pieces; the SPMD process executing on a particular physical
processor sees an array containing just those elements of the global
array that are mapped to that physical processor.


"The model assumes that array axes are mapped independently to axes of a
rectangular processors grid, each array axis to at most one processors
axis (no "skew" distributions) and no two array axes to the same
processors axis.  This restriction suffices to ensure that each
physical processor contains a subset of array elements that can be
locally arranged in a rectangular configuration (of course, to compute
the global indices of an element given its local indices, or vice versa,
may be quite a tangled computation -- but it will be possible)."

    Rob suggested:
>  
>  I think what we need is the inverse of create_mapped_array_descriptor.  Something
>  that is an ordinary struct or something in C++ that can be used to describe
>  to the HPF runtime (not the application) the shape and mapping of an array
>  that has been created by the C++ calling routine:
>  
>  
>     subroutine hpf_called_from_extrinsic(x_desc, x)
>  !
>  !  this routine is called collectively by an extrinsic that has passed in
>  !  both the mapped array X, and a descriptor of this mapped array
>  !
>  	type (mapped_array_descriptor) x_desc
>  	real x(:,:,:)   !   Use assumed shape array
>  	call describe_mapped_array_argument(x_desc, x)  ! Tell HPF about X

I'm not sure why we need this.


    ** SPMD execution model **


A communicator collectively allocates a mapped array by making a call
to create_mapped_array_descriptor(), which returns an opaque pointer.
HPF routines are called by all members of the communicator.  Opaque
pointers returned by create_mapped_array_descriptor() are passed as
arguments to an HPF routine.  All members of the communicator group
must pass the same opaque pointer in each corresponding position of the
argument list.  As far as the HPF routine is concerned, it won't know
the origin of the argument, which could have in fact originated from
another HPF program.


We must place certain restrictions on our SPMD/HPF model.

0. Calls to an allocation routine or to an HPF routine
    must be loosely synchronized.   Clearly all members of the communicator
    must EVENTUALLY call the HPF routine.

1. The SPMD program is responsible for passing global arrays to an HPF
   routine in conformance  with the routine's expectations, i.e. the distribution
   must match and so on.  (The only exception is when the HPF routine
   employs a transcriptive declaration.) This implies that we will
   need utilities to redistribute and remap existing arrays, since
   the compiler can't be expected to handle this for us.

2. Actual scalar arguments must be consistent across the communicator.

(This statement mirrors that of Annex A, which pertains to the HPF
calling SPMD model:

    "Actual arguments corresponding to scalar dummy arguments are
	replicated (by broadcasting, for example) in all processors.")

3. Data motion between communicators must be handled at the SPMD level.
  This implies that global arrays allocated in different communicators
  cannot be passed to the same HPF routine.  (There is language
  in Annex A that applies to the HPF to SPMD case.)
  Note that different subgroups can call different HPF routines,
  and this admits task parallelism.


4. We may not want to permit an HPF routine that has been called FROM SPMD
   to call, in turn, a lower-level SPMD (i.e. HPF local) routine.
   (There's a similar but mirrored restriction in Annex A for calling
   an HPF local routine from HPF.)

III. Query functions and other entries.
Required entries.
We need the following query routines, as described in Annex A:
 GLOBAL_DISTRIBUTION()
 GLOBAL_LBOUND()
 GLOBAL_UBOUND()
 GLOBAL_SHAPE()
 GLOBAL_SIZE()
 ABSTRACT_TO_PHYSICAL()
 PHYSICAL_TO_ABSTRACT()
 LOCAL_TO_GLOBAL()
 GLOBAL_TO_LOCAL()
 LOCAL_BLKCNT()
 LOCAL_LINDEX()
 LOCAL_UINDEX()


Note that we employ opaque mapped array descriptors in place of the
(HPF) ARRAY arguments that the Annex specifies for these functions.
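
To give a feel for the index arithmetic behind LOCAL_TO_GLOBAL() and
GLOBAL_TO_LOCAL(), here is a C sketch for a single axis distributed
CYCLIC(k).  The type and function names are hypothetical, indices are
zero-based rather than Fortran's one-based, and the axis is passed
directly instead of through a descriptor, so this is only the core
calculation and not the Annex A interface.

```c
#include <assert.h>
#include <stddef.h>

/* One array axis distributed CYCLIC(k) over nproc processors. */
typedef struct {
    size_t k;      /* block size                 */
    size_t nproc;  /* processors along this axis */
} cyclic_axis;

/* Processor (along this axis) that owns global index g. */
size_t cyclic_owner(cyclic_axis ax, size_t g)
{
    assert(ax.k > 0 && ax.nproc > 0);
    return (g / ax.k) % ax.nproc;
}

/* Local index of global index g on its owning processor. */
size_t cyclic_global_to_local(cyclic_axis ax, size_t g)
{
    assert(ax.k > 0 && ax.nproc > 0);
    return (g / (ax.k * ax.nproc)) * ax.k + g % ax.k;
}

/* Global index of local index l on processor p. */
size_t cyclic_local_to_global(cyclic_axis ax, size_t p, size_t l)
{
    assert(ax.k > 0 && p < ax.nproc);
    return (l / ax.k) * ax.k * ax.nproc + p * ax.k + l % ax.k;
}
```

With k = 2 on 3 processors, global index 7 lives on processor 0 at local
index 3, and mapping that pair back gives 7 again.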


We also need routines that handle HPF style REDISTRIBUTE,
and that perform I/O, should this be included in HPF 2.0.


 *** Matters to be resolved ***
	

1. There can be several processor arrangements, with different ranks
and upper and lower bounds.  If a processor changed communicators (more
likely, if it made two calls, one as a member of each communicator),
then HPF_INIT would have to be called again.  The key constraint is
that the HPF subroutine can declare exactly one processors arrangement,
and the SPMD call would have to match this declaration.  The problem
is we can't pass a PROCESSORS arrangement as an argument
because these are static directives in HPF. (Chuck Koelbel)

2.  Should HPF common blocks be accessible from SPMD?
    What are the consequences of this?  There may be a problem since it
isn't possible to specify data movement between communicators in HPF
code.  Presumably this implies that we can't access an array in COMMON unless
the element we want happens to be stored on a processor in the
appropriate group. (Chuck Koelbel)


3.  I/O should probably not be done from HPF directly.  In this case we
would need to provide all the HPF I/O facilities via specially defined
entries.
Comments?

Scott

From owner-hpff-external  Mon Sep 18 11:15:21 1995
Received: by cs.rice.edu (LAA28044); Mon, 18 Sep 1995 11:15:21 -0500
Received: from mail.think.com by cs.rice.edu (LAA28019); Mon, 18 Sep 1995 11:14:45 -0500
Received: from Delphi.Think.COM by mail.think.com; Mon, 18 Sep 95 12:14:15 -0400
Received: by delphi.think.com (4.1/Think-1.2)
	id AA27907; Mon, 18 Sep 95 12:14:15 EDT
Date: Mon, 18 Sep 95 12:14:15 EDT
Message-Id: <9509181614.AA27907@delphi.think.com>
From: Doug MacDonald <macdon@think.com>
To: baden@cs.ucsd.edu
Cc: schreibr@frey.riacs.edu, chk@cs.rice.edu, hpff-external@cs.rice.edu,
        baden@cs.ucsd.edu
In-Reply-To: Scott B. Baden's message of Sun, 17 Sep 1995 19:33:15 -0700 (PDT) <9509180233.AA21553@gili>
Subject: hpff-external: Re: Descriptors
Sender: owner-hpff-external
Precedence: bulk

   Here is a very rough  proposal for calling HPF from an extrinsic SPMD program.
   I'll call this "SPMD to HPF" for short.

   ...

Hi,

I've been working on a "local/global" programming model for Thinking
Machines and have some responses to this proposal.  In general I agree with
what it says.  I have some comments and suggestions throughout the text and
at the end.  There are probably other issues but these are the ones that
occur to me right now...

Regards,
Doug MacDonald

   We are going to need to work with the MPI and C++ folks as well as
   some of the vendors who are writing the compilers.


   Proposal for calling HPF from an extrinsic SPMD program.
	   Scott Baden, Chuck Koelbel, Rob Schreiber


   In some applications it is useful to handle coordination and data
   motion in an SPMD "harness" written in an extrinsic language, and to
   invoke HPF from this harness to handle computation.  This facility mirrors the
   "HPF-to-SPMD"  facility described in Annex A of the HPF spec, and
   "mirrors" many of the ideas described therein. We also get a limited
   form of task parallelism if we admit MPMD (multiple data multiple
   program) parallelism.

Descriptor queries called locally in an HPF-to-SPMD local routine should be
via a standard function interface to which an opaque descriptor ID is
passed.

   We can think of an SPMD program in terms of an MPI communicator:
   loosely synchronous execution of the same program on multiple
   processors. (Similarly with MPMD: different programs running on
   different processors.)

   Since MPI allows one to configure a processor topology for a
   communicator (using the cache mechanism) it makes sense to identify an
   MPI communicator topology with an HPF PROCESSORS spec.  The advantage
   of this approach is that the values returned by NUMBER_OF_PROCESSORS
   and PROCESSORS_SHAPE are well-defined.  Note that it is possible to
   dynamically change the members of a communicator or the processor
   topology.  The consequences of this will require some thought.

Is it necessary to link HPF with MPI?  They both seem to be dominant
standards, but if it can be avoided, it should be.  Linking puts users at
the mercy of the correctness and performance of the MPI implementation to
which they have access.

   3 sets of entries are needed that enable the user to  interact with
   HPF.  (This spec should be language independent in keeping with
   tradition.) These entries should be called with "the same arguments at
   roughly the same time," i.e. in loosely synchronous fashion, by all
   members of a  communicator.

       The entries

   I. hpf_init(), which establishes various state needed by the
   HPF run time system.

   II.  create_mapped_array_descriptor(), which returns an opaque pointer
   to a descriptor of a mapped array with specified shape,
   alignment, distribution, processors spec, and so on.
   (Rob also mentions an inverse of create_mapped_array_descriptor.
   I'll get to this later on.)

I like the idea of a function interface with a subroutine
create_mapped_array_descriptor() defined as part of an HPF implementation.
In this way, descriptors compatible with the HPF descriptor set are
generated, giving what I call _language interoperability_.

Another issue is _memory interoperability_.  HPF 1.0 at least doesn't mandate
a particular axis nesting order or even contiguity of local subgrids of a
global array.

Fortran subgrids are likely to be column-major.  In an HPF that allocates
local subgrids contiguously and in column-major order, this may be OK for a
Fortran 90 or f77 SPMD program.  But what about a C program?  You need to
specify a convention for what happens to local row-major arrays.  I
suggest reversing the array's axis ordering on the global level.
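
The reason axis reversal works without copying can be seen in a short C
sketch (function names are hypothetical, and the local subgrid is assumed
to be contiguous): a column-major Fortran array with extents (n1, n2)
occupies memory exactly as a row-major C array declared a[n2][n1] does.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of Fortran element A(i, j) (one-based) in a column-major
   array with extents (n1, n2). */
size_t fortran_offset(size_t n1, size_t n2, size_t i, size_t j)
{
    assert(i >= 1 && i <= n1 && j >= 1 && j <= n2);
    return (i - 1) + n1 * (j - 1);
}

/* Offset of C element a[r][c] (zero-based) in a row-major array
   declared with the axes reversed: a[n2][n1]. */
size_t c_reversed_offset(size_t n1, size_t n2, size_t r, size_t c)
{
    assert(r < n2 && c < n1);
    return r * n1 + c;
}
```

So Fortran A(i, j) and C a[j-1][i-1] name the same storage location, and
the C caller only has to describe the global array with its axes in the
opposite order.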

Sometimes local subgrids are enlarged to receive boundary exchange values
to support stencils and similar codes.  In the best of worlds, the HPF
implementation will also use such expanded subgrids in its stencil
optimizations.  Such an HPF implementation must be able to support
non-contiguous memory layouts (successive columns owned by a given
processor are separated by the top and bottom boundary points).

Also, there should be a story about (a) how such expanded subgrids are
mapped to the global array's coordinate space, and (b) when users must
guarantee subgrid boundary points to be consistent, e.g., just before a
call to global HPF.
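
A small C sketch of such an expanded subgrid, to make the non-contiguity
concrete.  The struct and function names are hypothetical, and the sketch
assumes a two-dimensional, column-major local subgrid with the same halo
width on every side.

```c
#include <assert.h>
#include <stddef.h>

/* A local subgrid of nx-by-ny interior points, stored column-major and
   surrounded by a halo (boundary exchange region) of width h. */
typedef struct {
    size_t nx, ny;  /* interior extents    */
    size_t h;       /* halo width per side */
} halo_grid;

/* Number of elements that must be allocated, halo included. */
size_t halo_alloc_size(halo_grid g)
{
    return (g.nx + 2 * g.h) * (g.ny + 2 * g.h);
}

/* Offset of interior point (i, j), zero-based, within the allocation. */
size_t halo_offset(halo_grid g, size_t i, size_t j)
{
    assert(i < g.nx && j < g.ny);
    return (i + g.h) + (g.nx + 2 * g.h) * (j + g.h);
}

/* Number of halo elements lying between the end of one interior column
   and the start of the next (zero would mean contiguous storage). */
size_t halo_column_gap(halo_grid g)
{
    return 2 * g.h;
}
```

With a 4-by-3 interior and h = 1, the allocation holds 30 elements and
each interior column is followed by 2 boundary points before the next
interior column begins.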

Support for creating replicated arrays is important.

   III. various query routines that enable the user to access a mapped
   array using an interface defined by HPF.  In effect these
   are "mirror images" of routines defined in  Annex A of the HPF spec.


   Details.


   I.  HPF_INIT().  I'll punt on this for now.


   II.  create_mapped_array_descriptor()


   This routine is the workhorse of the interface.  We borrow the idea
   from local HPF that a local grid is viewed in terms of a set of blocks,
   and that there is a standard interface comprising routines that enable
   the SPMD programmer to efficiently access and iterate over the data.
   This is nicely defined in Annex A, which I adapt for the task at hand:


   "All HPF arrays accessible from an SPMD procedure are logically carved
   up into pieces; the SPMD process executing on a particular physical
   processor sees an array containing just those elements of the global
   array that are mapped to that physical processor.


   "The model assumes that array axes are mapped independently to axes of a
   rectangular processors grid, each array axis to at most one processors
   axis (no "skew" distributions) and no two array axes to the same
   processors axis.  This restriction suffices to ensure that each
   physical processor contains a subset of array elements that can be
   locally arranged in a rectangular configuration (of course to compute
   the global indices of an element given its local indices or vice versa,
   may be quite a tangled computation-- but it will be possible)

       Rob suggested::
   >  
   >  I think what we need is the inverse of create_mapped_array_descriptor.  Something
   >  that is an ordinary struct or something in C++ that can be used to describe
   >  to the HPF runtime (not the application) the shape and mapping of an array
   >  that has been created by the C++ calling routine:
   >  
   >  
   >     subroutine hpf_called_from_extrinsic(x_desc, x)
   >  !
   >  !  this routine is called collectively by an extrinsic that has passed in
   >  !  both the mapped array X, and a descriptor of this mapped array
   >  !
   >  	type (mapped_array_descriptor) x_desc
   >  	real x(:,:,:)   !   Use assumed shape array
   >  	call describe_mapped_array_argument(x_desc, x)  ! Tell HPF about X

   I'm not sure why we need this.


       ** SPMD execution model**


   A communicator collectively allocates a mapped array by making a call
   to create_mapped_array_descriptor(), which returns an opaque pointer.
   HPF routines are called by all members of the communicator.  Opaque
   pointers returned by create_mapped_array_descriptor() are passed as
   arguments to an HPF routine.  All members of the communicator group
   must pass the same opaque pointer in each corresponding position of the
   argument list.  As far as the HPF routine is concerned, it won't know
   the origin of the argument, which could have in fact originated from
   another HPF program.


   We must place certain restrictions on our SPMD/HPF model.

   0. Calls to an allocation routine or to an HPF routine
       must be loosely synchronized.   Clearly all members of the communicator
       must EVENTUALLY call the HPF routine.

   1. The SPMD program is responsible for passing global arrays to an HPF
      routine in conformance  with the routine's expectations, i.e. the distribution
      must match and so on.  (The only exception is when the HPF routine
      employs a transcriptive declaration.) This implies that we will
      need utilities to redistribute and remap existing arrays since
      the compiler can't be expected to handle this for us.

   2. Actual scalar arguments must be consistent across the communicator.

   (This statement mirrors that of Annex A, which pertains to the HPF
   calling SPMD model:

       "Actual arguments corresponding to scalar dummy arguments are
	   replicated (by broadcasting, for example) in all processors.")

   3. Data motion between communicators must be handled at the SPMD level.
     This implies that global arrays allocated in different communicators
     cannot be passed to the same HPF routine.  (There is language
     in Annex A that applies to the HPF to SPMD case.) 
      Note that different subgroups can call different HPF routines,
     and this admits task parallelism.


   4. We may not want to permit an HPF routine that had been called FROM SPMD
      to in turn call SPMD at a lower level (i.e. HPF local) routine.
      (There's a similar but mirrored restriction in Annex A for calling
      an HPF local routine from HPF)

If there is to be full language interoperability, there shouldn't be any
restriction on SPMD -> global HPF -> SPMD or HPF_LOCAL calling.  Such a
restriction limits how HPF-compatible global library routines could be
implemented by third parties.  One very useful way to implement such library
routines is in SPMD.  If this call pattern is not permitted, HPF routines that
call SPMD routines could not themselves be called in an SPMD application.  Or
have I misunderstood this?

   III. Query functions and other entries.
   Required entries.
   We need the following query routines, as described in Annex A:
    GLOBAL_DISTRIBUTION()
    GLOBAL_LBOUND()
    GLOBAL_UBOUND()
    GLOBAL_SHAPE()
    GLOBAL_SIZE()
    ABSTRACT_TO_PHYSICAL()
    PHYSICAL_TO_ABSTRACT()
    LOCAL_TO_GLOBAL()
    GLOBAL_TO_LOCAL()
    LOCAL_BLKCNT()
    LOCAL_LINDEX()
    LOCAL_UINDEX()


   Note that we employ opaque mapped array descriptors in place of the
   (HPF) ARRAY arguments that the Annex specifies for these functions


   We also need routines that handle HPF style REDISTRIBUTE,
   and that perform I/O, should this be included in HPF 2.0.


    **** Matters to be resolved. ***


   1. There can be several processor arrangements, with different ranks
   and upper and lower bounds.  If a processor changed communicators (more
   likely, if it made two calls, one as a member of each communicator),
   then HPF_INIT would have to be called again.  The key constraint is
   that the HPF subroutine can declare exactly one processors arrangement,
   and the SPMD call would have to match this declaration.  The problem
   is we can't pass a PROCESSORS arrangement as an argument
   because these are static directives in HPF. (Chuck Koelbel)

   2.  Should HPF common blocks be accessible from SPMD?
       What are the consequences of this?  There may be a problem since it
   isn't possible to specify data movement between communicators in HPF
   code.  Presumably this implies that we can't access an array in COMMON unless
   the element we want happens to be stored on a processor in the
   appropriate group. (Chuck Koelbel)


   3.  I/O should probably not be done from HPF directly.  In this case we
   would need to provide all the HPF I/O facilities via specially defined
   entries.
   Comments?

   Scott

Another issue is _memory interoperability_.  This is a bit more tricky, since
Fortran subgrids are likely to be column-major.  OK for a Fortran 90 or f77
SPMD program, but what about a C SPMD program?  You need to specify a
convention for what happens to local row-major arrays.  If an HPF
implementation is to insist on column-major local memory storage, I suggest
a convention of reversing the array's axis numbering on the global level
when called from C.
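
To make the axis-reversal convention concrete, here is a minimal sketch
(Python, for illustration only; the helper names are hypothetical and not
part of any HPF interface, and zero-based indices are assumed).  It shows
that element (i, j, k) of a column-major subgrid of shape (n1, n2, n3) sits
at the same linear offset as element [k][j][i] of a row-major array of the
reversed shape (n3, n2, n1).

```python
def column_major_offset(index, shape):
    """Linear offset of a zero-based index in a column-major (Fortran) array."""
    offset, stride = 0, 1
    for i, n in zip(index, shape):
        offset += i * stride
        stride *= n
    return offset


def row_major_offset(index, shape):
    """Linear offset of a zero-based index in a row-major (C) array."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def c_view(index, shape):
    """Reverse the axis numbering: a C caller's view of a Fortran subgrid."""
    return tuple(reversed(index)), tuple(reversed(shape))
```

Under this convention no data needs to move; only the order in which the
caller names the axes changes.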

Support for creating replicated arrays is important.

Sometimes local subgrids are enlarged to receive boundary exchange values
to support stencils and similar codes.  In the best of worlds, the HPF
implementation will also use such expanded subgrids in its stencil
optimizations.  There should be a story about (a) how such subgrids are
mapped to the global array's coordinate space, and (b) when subgrid
boundaries must be consistent, e.g., just before a call to global HPF.
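
One possible story for (a), sketched in Python under stated assumptions:
a single axis, a BLOCK distribution with block size ceil(n/p), zero-based
indices, and a halo of fixed width on each side of the owned block.  All
names here are hypothetical.

```python
def block_bounds(n, p, proc):
    """Zero-based half-open range [lo, hi) of global indices owned by proc
    under a BLOCK distribution of n elements over p processors,
    assuming block size ceil(n / p)."""
    b = -(-n // p)  # ceiling division
    lo = min(proc * b, n)
    hi = min(lo + b, n)
    return lo, hi


def storage_to_global(s, n, p, proc, halo):
    """Map storage index s of the halo-expanded subgrid to a global index.
    Returns None when the halo cell falls outside the global array."""
    lo, _ = block_bounds(n, p, proc)
    g = lo + s - halo
    return g if 0 <= g < n else None


def is_halo(s, n, p, proc, halo):
    """True when storage index s is a boundary-exchange (halo) cell."""
    lo, hi = block_bounds(n, p, proc)
    return not (halo <= s < halo + (hi - lo))
```

With this mapping a halo cell simply names the global element it shadows,
so "consistent" in (b) can be read as: each halo cell holds the current
value of that global element.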

Some global Fortran implementations execute in a SIMD-like mode, with a
control thread on a scalar processor that broadcasts RPC-like function
calls to the processing nodes, as on the CM-5.  In such implementations,
local subgrid addresses and any other local addresses that are broadcast
from the control thread to the nodes are (and must be) identical node to
node.  This is guaranteed by a memory management subsystem in the RTS that
is managed from the scalar processor.  But SPMD routines running on nodes
don't have access to this mm scheme, and malloc() will probably yield
inconsistent values across the nodes.  This is a real problem for these
implementations.  One solution is to provide also an alternate routine,
create_mapped_array_and_descriptor(), which allocates temp storage using
the HPF memory management subsystem and guarantees equality of subgrid
addresses.

Given an HPF data mapping, you need to be able to convert local coordinates
to global coordinates and back again, to determine what position along a
processors axis a given local coordinate falls on, and whether a given
global index tuple is local to the querying processor.  Some of this is done by
GLOBAL_TO_LOCAL() and LOCAL_TO_GLOBAL().  They need to work for all
supported data mappings.
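
As an illustration of the conversions involved, here is a minimal sketch
for one axis distributed CYCLIC(k) over p processors, with zero-based
indices.  These are hypothetical helpers, not the Annex A routines; BLOCK
is the special case k = ceil(n/p) and CYCLIC is k = 1.  Since array axes
are mapped independently, a multi-dimensional index tuple would be handled
one axis at a time.

```python
def global_to_local(g, k, p):
    """Return (processor position, local index) for zero-based global
    index g under CYCLIC(k) over p processors."""
    block = g // k
    return block % p, (block // p) * k + g % k


def local_to_global(proc, l, k, p):
    """Inverse of global_to_local: global index of local index l on proc."""
    return ((l // k) * p + proc) * k + l % k


def is_local(g, k, p, me):
    """True when global index g is stored on processor position me."""
    return global_to_local(g, k, p)[0] == me
```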

From owner-hpff-external  Mon Sep 18 13:20:20 1995
Received: by cs.rice.edu (NAA04870); Mon, 18 Sep 1995 13:20:20 -0500
Received: from frey.riacs.edu by cs.rice.edu (NAA04858); Mon, 18 Sep 1995 13:20:04 -0500
Received: by frey.riacs.edu (8.6.12/1.35)
	id LAA05941; Mon, 18 Sep 1995 11:24:05 -0700
Date: Mon, 18 Sep 1995 11:24:05 -0700
From: schreibr@frey.riacs.edu (Rob Schreiber)
Message-Id: <199509181824.LAA05941@frey.riacs.edu>
To: hpff-external@cs.rice.edu
Subject: hpff-external: Calling HPF from Local
Sender: owner-hpff-external
Precedence: bulk



Here is a new proposal on a way to make an extrinsic local code
the "master" and an HPF code the "slave".


From baden@cs.ucsd.edu Sun Sep 17 19:35:49 1995
Subject: Re: Descriptors
To: schreibr@frey.riacs.edu (Rob Schreiber)
Cc: chk@cs.rice.edu, hpff-external@cs.rice.edu,
        baden@cs.ucsd.edu (Scott Baden)
X-Mailer: ELM [version 2.4 PL23]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 8203      

Here is a very rough  proposal for calling HPF from an extrinsic SPMD program.
I'll call this "SPMD to HPF" for short.

We are going to need to work with the MPI and C++ folks as well as
some of the vendors who are writing the compilers.


Proposal for calling HPF from an extrinsic SPMD program.
	Scott Baden, Chuck Koelbel, Rob Schreiber


In some applications it is useful to handle coordination and data
motion in an SPMD "harness" written in an extrinsic language, and to
invoke HPF from this harness to handle computation.  This facility mirrors
the "HPF-to-SPMD" facility described in Annex A of the HPF spec, and borrows
many of the ideas described therein.  We also get a limited form of task
parallelism if we admit MPMD (multiple program, multiple data) parallelism.


We can think of an SPMD program in terms of an MPI communicator:
loosely synchronous execution of the same program on multiple
processors. (Similarly with MPMD: different programs running on
different processors.)

Since MPI allows one to configure a processor topology for a
communicator (using the cache mechanism) it makes sense to identify an
MPI communicator topology with an HPF PROCESSORS spec.  The advantage
of this approach is that the values returned by NUMBER_OF_PROCESSORS
and PROCESSORS_SHAPE are well-defined.  Note that it is possible to
dynamically change the members of a communicator or the processor
topology.  The consequences of this will require some thought.
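
A minimal sketch of this identification, in Python for illustration.  It
assumes a Cartesian communicator topology with extents `dims` and the
row-major rank ordering that MPI Cartesian topologies use.  The function
names echo the HPF inquiry intrinsics but are hypothetical stand-ins, and
how MPI rank order should correspond to the element order of an HPF
PROCESSORS arrangement is an assumption a real interface would have to pin
down.

```python
from math import prod


def number_of_processors(dims):
    """Total processor count implied by the topology extents."""
    return prod(dims)


def processors_shape(dims):
    """Shape of the processors arrangement implied by the topology."""
    return tuple(dims)


def rank_to_coords(rank, dims):
    """Coordinates of a rank, using row-major rank ordering."""
    coords = []
    for n in reversed(dims):
        coords.append(rank % n)
        rank //= n
    return tuple(reversed(coords))


def coords_to_rank(coords, dims):
    """Inverse of rank_to_coords."""
    rank = 0
    for c, n in zip(coords, dims):
        rank = rank * n + c
    return rank
```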


Three sets of entries are needed that enable the user to interact with
HPF.  (This spec should be language independent in keeping with
tradition.) These entries should be called with "the same arguments at
roughly the same time," i.e. in loosely synchronous fashion, by all
members of a  communicator.

    The entries

I. hpf_init(), which establishes various state needed by the
HPF run time system.

II.  create_mapped_array_descriptor(), which returns an opaque pointer
to a descriptor of a mapped array with specified shape,
alignment, distribution, processors spec, and so on.
(Rob also mentions an inverse of create_mapped_array_descriptor.
I'll get to this later on.)


III. various query routines that enable the user to access a mapped
array using an interface defined by HPF.  In effect these
are "mirror images" of routines defined in  Annex A of the HPF spec.


Details.


I.  HPF_INIT().  I'll punt on this for now.


II.  create_mapped_array_descriptor()


This routine is the workhorse of the interface.  We borrow the idea
from local HPF that a local grid is viewed in terms of a set of blocks,
and that there is a standard interface comprising routines that enable
the SPMD programmer to efficiently access and iterate over the data.
This is nicely defined in Annex A, which I adapt for the task at hand:


"All HPF arrays accessible from an SPMD procedure are logically carved
up into pieces; the SPMD process executing on a particular physical
processor sees an array containing just those elements of the global
array that are mapped to that physical processor.


"The model assumes that array axes are mapped independently to axes of a
rectangular processors grid, each array axis to at most one processors
axis (no "skew" distributions) and no two array axes to the same
processors axis.  This restriction suffices to ensure that each
physical processor contains a subset of array elements that can be
locally arranged in a rectangular configuration (of course, computing
the global indices of an element given its local indices, or vice versa,
may be quite a tangled computation -- but it will be possible)."
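
To see why the restriction yields rectangular local sections, here is a
small Python sketch.  It assumes BLOCK distribution with block size
ceil(n/p) on every distributed axis and zero-based processor coordinates;
the function names and the `axis_to_proc_axis` encoding are hypothetical.

```python
def block_extent(n, p, coord):
    """Number of elements of an n-element axis held at position coord
    of a p-processor axis, assuming block size ceil(n / p)."""
    b = -(-n // p)  # ceiling division
    lo = min(coord * b, n)
    return min(lo + b, n) - lo


def local_shape(global_shape, axis_to_proc_axis, proc_shape, proc_coords):
    """Shape of the local section on one processor.

    axis_to_proc_axis[d] is the processors axis that array axis d is
    distributed over, or None if that array axis is not distributed.
    """
    used = [a for a in axis_to_proc_axis if a is not None]
    if len(used) != len(set(used)):
        raise ValueError("two array axes map to the same processors axis")
    shape = []
    for n, a in zip(global_shape, axis_to_proc_axis):
        if a is None:
            shape.append(n)
        else:
            shape.append(block_extent(n, proc_shape[a], proc_coords[a]))
    return tuple(shape)
```

Because each array axis is cut independently, the local section is always
the product of per-axis extents, i.e. a rectangle.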

    Rob suggested:
>  
>  I think what we need is the inverse of create_mapped_array_descriptor.  Something
>  that is an ordinary struct or something in C++ that can be used to describe
>  to the HPF runtime (not the application) the shape and mapping of an array
>  that has been created by the C++ calling routine:
>  
>  
>     subroutine hpf_called_from_extrinsic(x_desc, x)
>  !
>  !  this routine is called collectively by an extrinsic that has passed in
>  !  both the mapped array X, and a descriptor of this mapped array
>  !
>  	type (mapped_array_descriptor) x_desc
>  	real x(:,:,:)   !   Use assumed shape array
>  	call describe_mapped_array_argument(x_desc, x)  ! Tell HPF about X

I'm not sure why we need this.


    ** SPMD execution model**


A communicator collectively allocates a mapped array by making a call
to create_mapped_array_descriptor(), which returns an opaque pointer.
HPF routines are called by all members of the communicator.  Opaque
pointers returned by create_mapped_array_descriptor() are passed as
arguments to an HPF routine.  All members of the communicator group
must pass the same opaque pointer in each corresponding position of the
argument list.  As far as the HPF routine is concerned, it won't know
the origin of the argument, which could have in fact originated from
another HPF program.


We must place certain restrictions on our SPMD/HPF model.

0. Calls to an allocation routine or to an HPF routine
    must be loosely synchronized.   Clearly all members of the communicator
    must EVENTUALLY call the HPF routine.

1. The SPMD program is responsible for passing global arrays to an HPF
   routine in conformance  with the routine's expectations, i.e. the distribution
   must match and so on.  (The only exception is when the HPF routine
   employs a transcriptive declaration.) This implies that we will
   need utilities to redistribute and remap existing arrays, since
   the compiler can't be expected to handle this for us.

2. Actual scalar arguments must be consistent across the communicator.

(This statement mirrors that of Annex A, which pertains to the HPF
calling SPMD model:

    "Actual arguments corresponding to scalar dummy arguments are
	replicated (by broadcasting, for example) in all processors.")

3. Data motion between communicators must be handled at the SPMD level.
  This implies that global arrays allocated in different communicators
  cannot be passed to the same HPF routine.  (There is language
  in Annex A that applies to the HPF to SPMD case.) 
   Note that different subgroups can call different HPF routines,
  and this admits task parallelism.


4. We may not want to permit an HPF routine that had been called FROM SPMD
   to call, in turn, a lower-level SPMD (i.e. HPF local) routine.
   (There's a similar but mirrored restriction in Annex A for calling
   an HPF local routine from HPF.)
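
Restrictions 1 and 2 could be checked mechanically at call time.  The
following Python sketch simulates such a check over the arguments that each
member of a communicator passes; `Descriptor` is a hypothetical stand-in for
the opaque mapped-array descriptor, compared here by value rather than by
pointer, and `check_call` is an illustrative name, not a proposed entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical stand-in for an opaque mapped-array descriptor."""
    shape: tuple
    distribution: tuple  # e.g. ("BLOCK", "CYCLIC")


def check_call(expected_distribution, per_rank_args):
    """per_rank_args[r] is the (descriptor, scalar) pair passed by rank r.
    Returns a list of problems; an empty list means the call looks legal."""
    problems = []
    descs = {d for d, _ in per_rank_args}
    scalars = {s for _, s in per_rank_args}
    if len(descs) != 1:
        problems.append("descriptor differs across ranks")
    elif next(iter(descs)).distribution != expected_distribution:
        problems.append("distribution does not match the HPF routine")
    if len(scalars) != 1:
        problems.append("scalar argument inconsistent across ranks")
    return problems
```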

III. Query functions and other entries.
Required entries.
We need the following query routines, as described in Annex A:
 GLOBAL_DISTRIBUTION()
 GLOBAL_LBOUND()
 GLOBAL_UBOUND()
 GLOBAL_SHAPE()
 GLOBAL_SIZE()
 ABSTRACT_TO_PHYSICAL()
 PHYSICAL_TO_ABSTRACT()
 LOCAL_TO_GLOBAL()
 GLOBAL_TO_LOCAL()
 LOCAL_BLKCNT()
 LOCAL_LINDEX()
 LOCAL_UINDEX()


Note that we employ opaque mapped array descriptors in place of the
(HPF) ARRAY arguments that the Annex specifies for these functions.


We also need routines that handle HPF style REDISTRIBUTE,
and that perform I/O, should this be included in HPF 2.0.


 **** Matters to be resolved. ***
	

1. There can be several processor arrangements, with different ranks
and upper and lower bounds.  If a processor changed communicators (more
likely, if it made two calls, one as a member of each communicator),
then HPF_INIT would have to be called again.  The key constraint is
that the HPF subroutine can declare exactly one processors arrangement,
and the SPMD call would have to match this declaration.  The problem
is we can't pass a PROCESSORS arrangement as an argument
because these are static directives in HPF. (Chuck Koelbel)

2.  Should HPF common blocks be accessible from SPMD?
    What are the consequences of this?  There may be a problem since it
isn't possible to specify data movement between communicators in HPF
code.  Presumably this implies that we can't access an array in COMMON unless
the element we want happens to be stored on a processor in the
appropriate group. (Chuck Koelbel)


3.  I/O should probably not be done from HPF directly.  In this case we
would need to provide all the HPF I/O facilities via specially defined
entries.
Comments?

Scott



From owner-hpff-external  Wed Sep 20 01:55:01 1995
Received: by cs.rice.edu (BAA09205); Wed, 20 Sep 1995 01:55:01 -0500
Received: from VNET.IBM.COM by cs.rice.edu (BAA09192); Wed, 20 Sep 1995 01:54:53 -0500
From: zernik@VNET.IBM.COM
Received: from HAIFA by VNET.IBM.COM (IBM VM SMTP V2R3) with BSMTP id 8926;
   Wed, 20 Sep 95 02:54:29 EDT
Received: by HAIFA (XAGENTA 3.0) id 0744; Wed, 20 Sep 1995 08:54:43 +0200 
Received: by rs250-05.haifa.ibm.com (AIX 3.2/UCB 5.64/4.03)
          id AA27519; Wed, 20 Sep 1995 08:54:17 +0200
Date: Wed, 20 Sep 1995 08:54:17 +0200
Message-Id: <9509200654.AA27519@rs250-05.haifa.ibm.com>
To: hpff-external@cs.rice.edu, hpff@cs.rice.edu
Subject: hpff-external: HPF tools.
Sender: owner-hpff-external
Precedence: bulk



Could anybody provide me some information about tools for HPF?
Automatic parallelization, performance tuning, debugging, testing,
whatever?

Do any of the existing HPF forums relate to these issues?

Thnx,
Dror