From owner-hpff-external  Wed Jan  3 10:40:55 1996
Received: (from daemon@localhost) by cs.rice.edu (8.7.1/8.7.1) id KAA13022 for hpff-external-out; Wed, 3 Jan 1996 10:40:55 -0600 (CST)
Received: from timbuk.cray.com (root@timbuk.cray.com [128.162.19.7]) by cs.rice.edu (8.7.1/8.7.1) with SMTP id KAA13017 for <hpff-external@cs.rice.edu>; Wed, 3 Jan 1996 10:40:52 -0600 (CST)
Received: from ironwood.cray.com (root@ironwood-fddi.cray.com [128.162.21.36]) by timbuk.cray.com (8.6.12/CRI-gate-8-2.11) with ESMTP id KAA28541 for <hpff-external@cs.rice.edu>; Wed, 3 Jan 1996 10:40:51 -0600
Received: from hickory304 (meltzer@hickory304 [128.162.145.4]) by ironwood.cray.com (8.6.12/CRI-ccm_serv-8-2.8) with SMTP id KAA23303 for <hpff-external@cs.rice.edu>; Wed, 3 Jan 1996 10:40:50 -0600
From: Andy Meltzer <meltzer@cray.com>
Received: by hickory304 (5.x/btd-b3)
          id AA10392; Wed, 3 Jan 1996 10:40:46 -0600
Message-Id: <9601031640.AA10392@hickory304>
Subject: hpff-external: SPMD_HPF proposal
To: hpff-external@cs.rice.edu
Date: Wed, 3 Jan 1996 10:40:46 -0600 (CST)
In-Reply-To: <9510311703.AA16694@gili> from "Scott B. Baden" at Oct 31, 95 09:03:26 am
X-Mailer: ELM [version 2.4 PL24-CRI-b]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hpff-external
Precedence: bulk

---------------------------------------------------------------------------
hpff-external@cs.rice.edu is a mailing list for discussion of external
interfaces between HPF and other languages/systems.  Instructions for
adding or deleting yourself from this list appear at the bottom of this
message.
---------------------------------------------------------------------------

External Group -


As promised, here is my proposal for SPMD_HPF.  It is basically just
HPF_LOCAL with HPF 2.0 features grafted on, then a few extra features;
these are at the bottom of the document.


							Andy

#####################

Andy Meltzer 
Cray Research, Inc. 
655F Lone Oak Drive
Eagan, MN 55121

			SPMD_HPF


Introduction
------------

SPMD_HPF is a hybrid language, combining an SPMD execution model with
HPF 2.0 features.  The model combines the multi-threaded execution of
HPF_LOCAL and the HPF 2.0 syntax and features.  The goal of SPMD_HPF is
to attain the potential performance of an SPMD programming model with
access to HPF features and a well-defined extrinsic interface to HPF.
It is built on top of the HPF_LOCAL extrinsic environment.  This
language is based on the current definition of HPF 2.0 and should
change as HPF 2.0 changes.

SPMD features and a multi-threaded model let the user take advantage
of the performance and low-level access of a more general-purpose
programming model.  Including the HPF 2.0 data distribution features
gives the programmer access to the highest-performing aspects of both
models, at the cost of a somewhat more complex model.  SPMD_HPF is not
appropriate for all platforms, but it is consistent with HPF and is
easily targeted to platforms that have HPF and can support SPMD
programming styles.

The syntax of SPMD_HPF is a superset of the syntax of HPF 2.0, and the
extrinsic language's semantics are very similar to those of HPF.  There
are some differences, however.  I/O is one example: in SPMD_HPF
different PEs are allowed to read from different files at the same
time, whereas in HPF the PEs must all read from the same file.  The
differences in the models are principally caused by the multi-threaded
execution model and the introduction of the HPF_LOCAL data rules.

SPMD_HPF allows for the notion of private data.  Data defaults to a
mapping in which data items are replicated, one per processor.  The
values of the individual replicated data items and the flow of control
may vary from PE to PE within SPMD_HPF.  This behavior is consistent
with the behavior of HPF_LOCAL.  In SPMD_HPF a processor may be
individually named, and code may be executed based upon which
processor it is running on.


Execution Model
---------------

SPMD_HPF is built upon the fundamental execution model of HPF_LOCAL,
augmented with data mapping and work distribution features from HPF
2.0.  It is also augmented with many explicit low-level control
features, some taken from Cray Research's CRAFT language.

In SPMD_HPF all PEs begin executing in parallel, with data defaulting
to a replicated distribution, the default distribution for HPF_LOCAL.
Each PE gets a copy of the data storage unless specified otherwise by
the user.  Consequently I/O works identically to I/O in HPF_LOCAL and
message passing libraries are easily integrated.

In short, the execution model is that of HPF_LOCAL.


Data Mapping Features
---------------------

Data mapping feature syntax is identical to that in HPF 2.0.  The
semantics of the data mapping directives are also identical.

The only semantic difference (as mentioned above) is that there is a
default distribution in which data is replicated on the processors and
the values of the same data item on different processors may vary.
This is consistent with the HPF_LOCAL interpretation of the data
declaration.

When data is explicitly mapped, only one copy of the data storage is
created unless the explicit mapping directs otherwise.  The value of
explicitly mapped replicated data items must be consistent between
processors as is the case in HPF 2.0.

A new directive is suggested for completeness:  PE_PRIVATE, which
specifies that the data should conform to the default behavior.
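
The three kinds of storage described above can be sketched as follows.
This is only an illustration: the proposal names PE_PRIVATE but gives
no syntax for it, so the directive line below is an assumed spelling,
and the variable names are made up.  Because !HPF$ lines are comments
to a plain Fortran 90 compiler, the program also compiles and runs
serially.

```fortran
program mapping_sketch
  implicit none
  integer, parameter :: n = 100
  real :: a(n)   ! explicitly mapped below: a single distributed copy
  real :: p(n)   ! no directive: private (default), one copy per PE
  real :: q(n)   ! named in PE_PRIVATE: same behavior as p, stated explicitly
!HPF$ DISTRIBUTE a(BLOCK)
!HPF$ PE_PRIVATE q
  ! The PE_PRIVATE line above uses an assumed syntax; the proposal only
  ! names the directive.

  a = 1.0        ! all PEs participate; one distributed array is updated
  p = 3.0        ! each PE assigns its own copy; copies may later diverge
  q = 4.0        ! likewise, per-PE storage

  print *, a(1), p(1), q(1)
end program mapping_sketch
```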


Subprogram Interfaces
---------------------

Calling an SPMD_HPF Subprogram from SPMD_HPF
--------------------------------------------

The behavior and requirements of an SPMD_HPF program at subprogram
interfaces are identical to those of HPF 2.0 for dummy arguments that
are explicitly mapped.

All processors must co-operate in a subprogram invocation that remaps
or explicitly maps data.  In other words, if an explicit interface is
required (by the HPF 2.0 rules) or the subprogram declares explicitly
mapped data, the subprogram must be called on all processors.
Processors need not co-operate if there are only reads to non-local
data.

The user also has the option of passing data that has been explicitly
mapped in the caller to dummy arguments that are not explicitly mapped
in the callee.  The mapping rules for this data are identical to the
mapping rules when HPF calls an HPF_LOCAL routine.  The data remains
"in-place" and is interpreted differently.

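A minimal sketch of the "in-place" case, assuming the HPF_LOCAL
convention that an unmapped dummy argument is associated with the part
of the actual argument stored on the executing PE.  The module and
routine names are hypothetical.  Compiled as ordinary Fortran 90 (where
the directive is a comment) the callee simply sees the whole array.

```fortran
module local_ops
  implicit none
contains
  subroutine scale_local(x)
    ! x is not explicitly mapped, so under the HPF_LOCAL rules each PE
    ! would see only its own section of the caller's array, in place.
    real, intent(inout) :: x(:)
    x = 2.0 * x
  end subroutine scale_local
end module local_ops

program inplace_sketch
  use local_ops
  implicit none
  real :: a(100)
!HPF$ DISTRIBUTE a(BLOCK)

  a = 1.0
  call scale_local(a)   ! no remapping: the data stays where it is
  print *, a(1), a(100)
end program inplace_sketch
```
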

Calling an SPMD_HPF Subprogram from HPF 2.0
-------------------------------------------

The calling convention and argument passing rules for SPMD_HPF are a
hybrid of those for HPF 2.0 calling HPF_LOCAL and HPF 2.0 calling HPF
2.0.  Explicit interfaces are required.  Where dummy arguments are
private (default) storage, the HPF calling HPF_LOCAL conventions are
used.  Where dummy arguments are explicitly mapped, the calling
convention matches HPF calling HPF.


Executable Statements
---------------------

The INDEPENDENT directive
-------------------------

Two different forms of the INDEPENDENT directive are proposed.  The
first is like the use of INDEPENDENT in HPF.  The second incorporates
the ON_HOME clause and has a number of differences that make it easier
to use.

INDEPENDENT without the ON_HOME clause is identical to the current
HPF INDEPENDENT directive.  Within these INDEPENDENT loops
the values of private data may change.

INDEPENDENT applied to FORALL has the same syntax and semantics as
in HPF.


INDEPENDENT with the ON_HOME Clause
-----------------------------------

The INDEPENDENT directive in SPMD_HPF may be applied to the first of a
group of tightly nested loops and may apply to more than one of them.
This makes the ON_HOME clause easier to use.  The current INDEPENDENT
directive applies only to a single loop nest.

The INDEPENDENT directive is extended so that multiple loop nests can
be named.

The general syntax for these independent loops is as follows:

    !HPF$ INDEPENDENT (I1, I2, ..., In) ON_HOME X(h1(I), h2(I), ..., hn(I))
          DO I1 = L1, U1, S1
            DO I2 = L2, U2, S2   
              DO In = Ln, Un, Sn
                ...
              END DO                            
            END DO                              
          END DO


The syntax and semantics of INDEPENDENT with the ON_HOME clause are
different from its syntax and semantics without the ON_HOME clause.
With the ON_HOME clause the directive states that there are no
cross-processor dependencies, but there may be dependencies between
iterations on a processor.  It also indicates which loop iterations it
refers to.

If the ON_HOME clause is used, INDEPENDENT must be used in the
multi-line form.
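
A concrete instance of the general form above might look like the
following sketch.  The array names and the accumulation into the
private scalar s are illustrative assumptions; s is meant to show a
dependence that is carried between iterations on one PE but never
between PEs.  On a single processor (or when compiled as plain Fortran
90, where the directives are comments) s ends up holding the full sum;
under SPMD_HPF each PE's s would presumably hold only the sum over the
elements it owns.

```fortran
program on_home_sketch
  implicit none
  integer, parameter :: n = 8
  real :: a(n, n), b(n, n)   ! explicitly mapped below
  real :: s                  ! private (default): one copy per PE
  integer :: i, j
!HPF$ DISTRIBUTE a(BLOCK, BLOCK)
!HPF$ ALIGN b(i, j) WITH a(i, j)

  b = 1.0
  s = 0.0

  ! One directive names both loops of the tightly nested pair; iteration
  ! (i, j) is executed by the PE that owns a(i, j).
!HPF$ INDEPENDENT (i, j) ON_HOME a(i, j)
  do i = 1, n
    do j = 1, n
      a(i, j) = 2.0 * b(i, j)   ! touches only data local to the owner
      s = s + a(i, j)           ! dependence between iterations on one PE
    end do
  end do

  print *, s
end program on_home_sketch
```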


The NEW Clause
--------------

An HPF independent loop optionally may have a NEW clause.  The NEW
clause is not required by SPMD_HPF because in SPMD_HPF data defaults to
replicated and values may differ from PE to PE.

Replicated data, however, behaves slightly differently from data
specified in the NEW clause.  The value of a private datum on each
PE can be used beyond a single iteration of the loop.  The values of
data items named in a NEW clause may not be used beyond a single
iteration.  The NEW clause asserts that the INDEPENDENT directive
would be valid if new objects were created for the variables named in
the clause for each iteration of the loop.

The semantics of the NEW clause are identical in SPMD_HPF and HPF 2.0.
The variables named in a NEW clause apply only to the immediately
subsequent loop nest.
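
As a small illustration of the distinction, the sketch below uses the
HPF form of the NEW clause on a temporary t.  The names are made up for
the example.  Since t is named in NEW, its value is meaningful only
within one iteration and should not be read after the loop.

```fortran
program new_clause_sketch
  implicit none
  integer, parameter :: n = 16
  real :: x(n), y(n)
  real :: t
  integer :: i
!HPF$ DISTRIBUTE x(BLOCK)
!HPF$ ALIGN y(i) WITH x(i)

  x = 3.0

!HPF$ INDEPENDENT, NEW (t)
  do i = 1, n
    t = x(i) * x(i)   ! t is defined before it is used in every iteration
    y(i) = t + 1.0
  end do
  ! t is not relied upon here: values of NEW variables do not outlive
  ! a single iteration.

  print *, y(1), y(n)
end program new_clause_sketch
```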


Array Syntax
------------

Array syntax is treated in SPMD_HPF exactly as in HPF 2.0 for
explicitly mapped objects.  For replicated objects the behavior is
identical to that of HPF_LOCAL.  When private (default) objects and
explicitly mapped objects are combined, the rules below apply to a
statement of the form:


  result = rhs1 op1 rhs2 op2 ... opm

  -  If result is explicitly mapped and all rhs arrays are
     explicitly mapped, the work is distributed as in HPF.

  -  If result is private and all rhs arrays are private, the
     computation is done on all processors as an HPF_LOCAL program
     would do it.

  -  If result is private and all rhs arrays are explicitly
     mapped, the work is distributed as in HPF and the values of the
     results are broadcast to the result on each processor.

  -  If result is explicitly mapped and NOT all rhs arrays are
     explicitly mapped, the results of the operation are undefined.

  -  If result is private and some, but not all, rhs arrays are
     explicitly mapped, the value is computed on each processor
     and saved to the local result.

For consistency, all processors must participate in any array syntax 
statement in which the value of an explicitly mapped array is modified.
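
The rules above can be read against a few concrete statements.  The
sketch below is illustrative only; the array names are made up, and the
comments restate the rule that each statement would fall under.  The
undefined case is shown commented out.

```fortran
program array_syntax_sketch
  implicit none
  integer, parameter :: n = 100
  real :: a(n), b(n), c(n)   ! explicitly mapped below
  real :: p(n), q(n)         ! private (default)
!HPF$ DISTRIBUTE a(BLOCK)
!HPF$ ALIGN b(i) WITH a(i)
!HPF$ ALIGN c(i) WITH a(i)

  b = 1.0
  c = 2.0
  q = 3.0

  a = b + c     ! mapped result, all rhs mapped: work distributed as in HPF
  p = q * 2.0   ! private result, all rhs private: every PE computes its copy
  p = b + c     ! private result, all rhs mapped: distributed, then broadcast
  p = b + q     ! private result, mixed rhs: computed on each PE into local p
  ! a = b + q   ! mapped result, mixed rhs: undefined under these rules

  print *, a(1), p(1)
end program array_syntax_sketch
```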


Sequence and Storage Association
--------------------------------

Storage and sequence association rules are identical to HPF 2.0 for
explicitly distributed data.  Data that is private follows the rules
for ordinary Fortran 90 sequence and storage association.  This is
consistent with HPF_LOCAL.


Input and Output
----------------

Private I/O in HPF has sequential semantics, while private I/O in
SPMD_HPF has parallel semantics.  In other words, a private read in
SPMD_HPF requires each PE to read each element of data in a given file,
while a private read in HPF requires a single read by one PE and a
broadcast of that value (where necessary) to all other PEs.  If the
same file is specified, both languages generate the same results (with
great I/O overhead in the SPMD_HPF case).  SPMD_HPF allows each PE to
open and read from a different file, a feature unavailable in HPF.
Private writes cause many more differences between the two languages,
however.  In SPMD_HPF the user must use some sort of synchronization to
ensure that only one PE writes to a file.
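
A sketch of per-PE file access, under stated assumptions: the variable
me stands for the PE number, which would come from PROCID, but the
proposal does not give that interface, so a placeholder value is
assigned instead.  The file naming scheme is also made up.  Because
each PE touches only its own file, no synchronization is needed in this
particular case.

```fortran
program private_io_sketch
  implicit none
  integer :: me                 ! PE number (placeholder, see below)
  integer :: v                  ! private (default) data
  character(len=16) :: fname

  me = 0   ! assumption: would be obtained from PROCID in SPMD_HPF

  ! Build a distinct file name per PE, e.g. pe_0000.dat for PE 0.
  write (fname, '(A,I4.4,A)') 'pe_', me, '.dat'

  ! Each PE creates and writes its own file ...
  open (unit=10, file=fname, status='replace', action='readwrite')
  write (10, *) 100 + me
  rewind (10)

  ! ... and reads it back into private storage; no broadcast is involved.
  read (10, *) v
  close (10, status='delete')

  print *, 'PE', me, 'read', v
end program private_io_sketch
```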


Serial Regions
--------------

Because of other features within the model it is often useful to enter
a region where only one process is executing.  This is particularly
useful for certain types of I/O.  To facilitate this, two directives
are added:

  MASTER 
  END MASTER 


In addition, one may optionally attach a COPY clause to the
END MASTER directive, which specifies the replicated (default)
data items whose values should be broadcast to all processors.
The syntax of these directives is:

  !HPF$ MASTER
  !HPF$ END MASTER [, COPY ( var1 [, var2, ... , varn ] )]


where each var is a private data item to be copied to the other
processors.

Serial regions provide implicit barrier synchronization points
at both the  MASTER and  END MASTER directives.  

Serial regions can be nested, but inner directives are ignored.  There
must be a matching, properly nested  END MASTER directive for each
MASTER directive.

If a subroutine call occurs within a serial region, the subroutine
executes serially; there is no way to get back to parallel execution
within the subroutine.  All explicitly mapped data is accessible from 
within subroutines called in a serial region, but a subroutine called
from within a serial region cannot declare explicitly mapped data
or remap data.  All processors must participate in the invocation of 
the serial region.
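
A minimal sketch of a serial region with a COPY clause, using the
syntax given above.  The variable names and values are illustrative; in
practice the region might read a parameter file instead of assigning
constants.

```fortran
program master_sketch
  implicit none
  integer :: nsteps   ! private (default)
  real :: dt          ! private (default)

  nsteps = 0
  dt = 0.0

!HPF$ MASTER
  ! Only one PE executes this region; the others wait at the implicit
  ! barrier.
  nsteps = 1000
  dt = 0.01
!HPF$ END MASTER, COPY (nsteps, dt)
  ! After END MASTER, every PE's private nsteps and dt hold the values
  ! set inside the region.

  print *, nsteps, dt
end program master_sketch
```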


Library and Intrinsic Routines
------------------------------

HPF Library
-----------

The HPF Library is available to SPMD_HPF when called with data that is
explicitly mapped and all processors are participating in the call.

Parallel Inquiry Intrinsics
---------------------------

These intrinsics are provided as an extension to HPF.  They provide
information potentially useful to the programmer about the state of
execution in a program.

  IN_PARALLEL() 
  IN_INDEPENDENT() 


Task Identity
-------------

 PROCID from HPF_LOCAL is provided.


Synchronization Primitives
--------------------------

It is suggested that a number of synchronization primitives be provided
since this model can be programmed at a much lower level than HPF 2.0.
These primitives include:

 Locks (set, wait, clear)
 Critical Sections 
 Events (test, set, wait, clear)
 Barriers (test, set, wait)


Other
-----

In CRAFT we have found a number of other directives to be extremely
useful.  None of them is required by this model, but we should
consider them for inclusion.

Barrier Removal
---------------

It is occasionally useful for an advanced programmer to indicate to
the compilation system where barriers are not needed (even though the
compiler, working from incomplete knowledge, might think that they are
necessary).

    NO BARRIER 


Parallelism Specification Directives
------------------------------------

These directives allow a user to assert that a routine will be called
only from within a parallel region, only from within a serial region,
or from within both kinds of region.  Without these directives an
implementation might, depending upon its implementation strategy, be
required to generate two versions of code for each subroutine.  The
directives simply make the generated code smaller and remove a test.

  PARALLEL_ONLY 
  SERIAL_ONLY 
  PARALLEL_AND_SERIAL 
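
As an illustration, a routine intended to be called only from a serial
region might be marked as in the sketch below.  The proposal lists the
directive names but not where they are placed, so putting the directive
at the top of the subroutine body is an assumption, as are the routine
and variable names.

```fortran
subroutine log_step(step)
!HPF$ SERIAL_ONLY
  ! Assumed placement: asserts this routine is called only from within
  ! a serial region, so only a serial version needs to be generated.
  implicit none
  integer, intent(in) :: step
  print *, 'step', step
end subroutine log_step

program spec_sketch
  implicit none
  integer :: k

  k = 1
!HPF$ MASTER
  call log_step(k)   ! executes serially, consistent with the assertion
!HPF$ END MASTER
end program spec_sketch
```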


SYMMETRIC
---------

SYMMETRIC data is private data that is guaranteed to be at the
same storage location on every processor.  The feature is obviously
tied to certain implementations, but does make PUT and GET functionality
much easier to deal with.

PE_RESIDENT
-----------

The PE_RESIDENT directive can be applied to loops and at the
subroutine level.  It is an assertion that the accesses to a particular
variable in the subroutine (or loop) are only accesses to data that is
local to the processor making the assertion.  For example:

         REAL A(100), B(100) 
  !HPF$  DISTRIBUTE A(BLOCK), B(BLOCK) 
  !HPF$  PE_RESIDENT A, B 

indicates that only local elements of arrays A and B will
be accessed within the subroutine.

Note that this is an assertion about the behavior of a program and
not a directive to make it so.


GEOMETRY
--------

The GEOMETRY directive is like a mapping typedef, allowing the
user to conveniently change the mappings of many arrays at the same
time.  It is similar in many ways to the TEMPLATE directive, but
since it is bound to no particular extent it is easier to apply in a
general way.  Users of CRAFT tend to use this feature heavily to
quickly distribute a set of arrays similarly.

The syntax of the GEOMETRY directive is:

  !HPF$ GEOMETRY geom(d1 [, d2, ..., dn])
  !HPF$ DISTRIBUTE geom [::] var1 [, var2, ... , varm]


where each di is one of the allowable distribution formats.
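
A short sketch of the directive in use, following the syntax above.
The geometry name g and the arrays are made up for the example.  Note
that w has a different shape from u and v; since a geometry is bound to
no particular extent, the same mapping can presumably be applied to all
three.  Changing the single GEOMETRY line would then change the mapping
of every array distributed with it.

```fortran
program geometry_sketch
  implicit none
  real :: u(64, 64), v(64, 64)
  real :: w(128, 32)
  ! One mapping definition: first dimension BLOCK, second kept on-PE.
!HPF$ GEOMETRY g(BLOCK, *)
  ! Applied to arrays of different extents in one directive.
!HPF$ DISTRIBUTE g :: u, v, w

  u = 1.0
  v = 2.0
  w = 3.0
  print *, u(1, 1) + v(1, 1) + w(1, 1)
end program geometry_sketch
```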


New Features of SPMD_HPF
------------------------

SPMD_HPF starts with the HPF_LOCAL extrinsic environment and then adds
all of HPF 2.0.  This section lists the new features of SPMD_HPF.


 - Changes to INDEPENDENT to better support ON_HOME.
 - New rules defining the interaction of explicitly mapped and private
   data.
 - Parallel inquiry intrinsics:
    - IN_PARALLEL()
    - IN_INDEPENDENT()
 - Serial regions (MASTER / END MASTER)
 - Explicit synchronization primitives:
    - Locks (set, wait, clear)
    - Critical Sections
    - Events (test, set, wait, clear)
    - Barriers (test, set, wait)
 - PE_PRIVATE directive to specify default data mapping behavior
 - Other suggested features:
    - PARALLEL_ONLY
    - SERIAL_ONLY
    - PARALLEL_AND_SERIAL
    - PE_RESIDENT
    - SYMMETRIC
    - GEOMETRY
  
---------------------------------------------------------------------------
To (un)subscribe to this list, send mail to hpff-external-request@cs.rice.edu.
Leave the subject line blank, and in the body put the line
(un)subscribe <email-address>
---------------------------------------------------------------------------

