HETEROGENEOUS ELEMENT PROCESSOR
Document Type:
Collection:
Document Number (FOIA)/ESDN (CREST): CIA-RDP86B00689R000300040031-0
Release Decision: RIFPUB
Original Classification: K
Document Page Count: 7
Document Creation Date: December 20, 2016
Document Release Date: July 17, 2007
Sequence Number: 31
Case Number:
Publication Date: March 1, 1981
Content Type: OPEN SOURCE
File:
Attachment | Size |
---|---|
CIA-RDP86B00689R000300040031-0.pdf | 1.25 MB |
Body:
Approved For Release 2007/07/17: CIA-RDP86B00689R000300040031-0
Tomorrow's Computer Is Here Today
Denelcor's Heterogeneous Element Processor (HEP) is a large-scale (64-bit) high-speed digital computer whose architecture makes all other supercomputer architectures obsolete. HEP provides a totally new computing environment: high-speed, parallel processing of heterogeneous data elements. HEP has been designed for use in scientific and/or commercial applications which can effectively utilize processing speeds of ten million to 160 million instructions per second. HEP achieves this throughput because its design implements the Multiple Instruction Stream, Multiple Data Stream (MIMD) architectural concept for the first time in a commercially available computer, offering the user up to 1,024 independent instruction streams, or processes, each with its own data stream, to be used concurrently in programming applications. This multiplicity of instruction streams running in parallel enables and encourages breaking the application into its component parts for parallel processing. Other features of the HEP design provide the synchronization necessary to facilitate cooperation between concurrent processes, and eliminate the precedence delays which often occur when parallel processing is attempted using more conventional data processing equipment. An equal number of Supervisory Processes is available for processing the privileged functions necessary to the support of the User Processes, for a total of 2,048 independent instruction streams.
The many capabilities of the HEP hardware are fully supported by HEP System Software, so that the potential performance of the system is realized with relative ease. Using the available System Software, programming HEP is very similar to programming a conventional system, and only minimal additional programmer training is required.

In addition to the obvious design goals of fast throughput and the ability to solve very large and complex problems, HEP is designed for ease of operation and to be highly effective across the full range of general-purpose computing applications.
HEP hardware is modular and field expandable.
HEP achieves its high-speed performance through advanced architectural concepts rather than through unproven "leading edge technology" electronic components. This provides the user benefits in economy and reliability.
HEP Parallel Fortran is designed for maximum similarity to existing languages, with logical extensions as necessary to implement the advanced features of HEP.
HEP is designed for ease of maintenance in the event of hardware malfunction. Maintainability features are an integral part of the hardware design, including an on-board maintenance diagnostic system which implements an Interactive Maintenance Language for diagnostic purposes.
Evolution of Computer
Architecture
The earliest computers executed a single instruction at a time, using a single piece of data. The architecture of these machines, called SISD (for Single Instruction, Single Data Stream) computers, was straightforward and well suited to the technology of the times. As technology advanced and computer users required greater performance, SISD machines were made faster and faster, using newer and better components and designs. But a fundamental problem remained. Although the execution of a computer instruction is physically composed of several parts - instruction fetch, operand fetch, execution and result store - the SISD computer could only perform one of these at a time, since each step depended on the completion of the previous one. Thus, three-fourths of the expensive hardware stood idle at any given time, waiting for the rest of the hardware to finish operation.

[Figure: SISD - Single Instruction, Single Data Stream]

SISD designers attempted to remedy this by a technique called "look-ahead", in which instruction fetch for the next instruction was overlapped with some portion of the execution of the current instruction. This provided some performance improvement. However, digital computer programs, particularly those written in higher level languages, contain large numbers of test and branch instructions, in which the choice of the next instruction depends on the outcome of the current instruction. In such cases, "look-ahead" offers no speedup, and introduces substantial complexity to make sure that the partial execution of an incorrect next instruction does not contaminate the computation.

Another approach to increasing the speed of computation is to make multiple copies of portions of the SISD hardware. In this approach, called SIMD (for Single Instruction, Multiple Data Stream), the operand fetch, execution and result store portions of the hardware were replicated, so that the execution of a single instruction caused several values to be fetched, computed upon and the answers stored. For certain problems, this provided a substantial performance improvement. With sufficient hardware, entire vectors of numbers could be operated upon simultaneously. However, as with "look-ahead" SISD machines, the occurrence of test and branch instructions, among others, required the machine to wait for the total completion of the instruction before proceeding. The test and branch itself could make no use of the replicated hardware.

In addition, two new problems were created by the SIMD architecture. Substantial portions of most programs are not vector-oriented. The computation of iteration variables and array subscripts is a scalar problem, for which SIMD offers no speedup, and the collection of operands across arrays is an addressing problem which many SIMD architectures do not handle. As a second problem, if an SIMD computer has a fixed quantity of replicated execution modules (adders, etc.), and if the length of the vector which the user wishes to operate on differs from the vector length of the machine, performance suffers and software complexity increases. The cost of computation remains high since the hardware is often not fully utilized.

[Figure: SIMD - Single Instruction, Multiple Data Stream]
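The vector-length mismatch described above is easy to quantify. The sketch below is illustrative only - the machine width and vector length are hypothetical parameters, not figures for HEP or any particular SIMD machine - and computes how many passes a fixed-width SIMD unit needs and what fraction of its lanes do useful work:

```python
import math

def simd_utilization(vector_length, machine_width):
    """Passes needed and lane utilization for a fixed-width SIMD unit."""
    passes = math.ceil(vector_length / machine_width)
    # Every pass occupies all lanes, but only vector_length
    # element slots hold useful work.
    utilization = vector_length / (passes * machine_width)
    return passes, utilization

# A 100-element vector on a 64-lane machine needs 2 passes,
# and the second pass leaves 28 of 64 lanes idle.
passes, util = simd_utilization(100, 64)
print(passes, util)  # 2 0.78125
```

Unless the vector length is an exact multiple of the machine width, some replicated hardware sits idle on the final pass - the cost problem the text describes.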
Continued difficulties with the implementation of high-performance, cost-effective computation using single instruction machines have led to the development of a new concept in computer architecture.

This concept, called MIMD (for Multiple Instruction, Multiple Data Stream) architecture, achieves high performance at low hardware cost by keeping all processor hardware utilized executing multiple parallel programs simultaneously. For example, while an add is in progress for one process, a multiply may be executing for another, a divide for a third; or similar functions may be executing simultaneously, such as multiple adds or divides. In MIMD architectures, cooperating programs are often called "processes". Independent programs may contain one or several processes.

[Figure: MIMD - Multiple Instruction, Multiple Data Stream]

Because the multiple instructions executed concurrently by an MIMD machine are independent of each other, execution of one instruction does not influence the execution of other instructions and processing may be fully parallel at all times.

Successful MIMD architectures (figure 3) also provide low-overhead mechanisms for inter-process communication. In these architectures, data locations may contain not only a value but a state. Processes may synchronize by waiting for input locations to have the "full" state. Result storage may wait for output locations to attain the "empty" state resulting from consumption of their contents by other processes. Since this arbitration of the state of memory locations is handled by hardware and without affecting the execution of unrelated instructions, the communication delay is short and the overhead is small.

MIMD computers may be used to execute either SISD or SIMD programs. SISD programs are just MIMD programs with no inter-program communication. Execution of multiple identical MIMD programs is equivalent to execution of an SIMD program.

In the SIMD case, MIMD computers may match the vector lengths exactly, while using remaining resources for unrelated computation. Thus, high efficiency may be maintained even through scalar portions of the code. But the major application of MIMD computers lies in problems of sufficient complexity that straightforward vector computation is not feasible. In these cases, which include continuous simulation and complicated partial differential equation solutions, MIMD architecture offers the only possible method of achieving significant parallelism. Denelcor's Heterogeneous Element Processor system is the only commercially available MIMD computer.
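As a rough software analogy of the add/multiply/divide example above - Python threads standing in for independent hardware instruction streams, with no HEP-specific API implied - a minimal MIMD-style sketch runs dissimilar operations concurrently:

```python
import threading

# Each worker is an independent "instruction stream" performing a
# different operation on its own data.
results = {}

def adder():
    results["add"] = 2 + 3

def multiplier():
    results["mul"] = 4 * 5

def divider():
    results["div"] = 10 / 2

threads = [threading.Thread(target=f) for f in (adder, multiplier, divider)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # results == {'add': 5, 'mul': 20, 'div': 5.0}
```

No stream waits on another here; that independence is what lets an MIMD machine keep all functional units busy at once.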
HEP Architecture
The HEP computer system consists of process execution modules (PEMs), data memory modules and support processors interconnected by a high-speed data switch network. All data memory modules are accessible by all PEMs. Thus, processes executing in parallel in one or several PEMs may cooperate by reading and writing shared information in the data memories. Parallel processes synchronize and pass information back and forth using the full/empty attribute of each data memory location. HEP instructions may automatically wait for an input data memory location to be full before execution, and leave the location empty after execution. Instructions may also wait for an output location to be empty before execution and leave it full after execution. This communications discipline allows processes to conveniently and unambiguously pass information to other processes while executing. The full/empty attribute ensures that reads and writes of inter-process variables will alternate and no information will be lost. For locations used exclusively within a process, the full/empty attribute is ignored and memory may be accessed conventionally.
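The full/empty discipline can be modeled in software. The sketch below is an illustrative analogy built from ordinary Python synchronization primitives - it is not HEP's mechanism, which performs this arbitration per location in hardware with no software overhead - showing one memory cell whose reads wait for "full" and leave it "empty", and whose writes wait for "empty" and leave it "full":

```python
import threading

class FullEmptyCell:
    """Software model of a HEP-style full/empty memory location."""
    def __init__(self):
        self._value = None
        self._full = False
        self._cond = threading.Condition()

    def write(self, value):
        # Wait for the "empty" state, store the value, set "full".
        with self._cond:
            self._cond.wait_for(lambda: not self._full)
            self._value, self._full = value, True
            self._cond.notify_all()

    def read(self):
        # Wait for the "full" state, consume the value, set "empty".
        with self._cond:
            self._cond.wait_for(lambda: self._full)
            value, self._full = self._value, False
            self._cond.notify_all()
            return value

cell = FullEmptyCell()
out = []
consumer = threading.Thread(
    target=lambda: out.extend(cell.read() for _ in range(3)))
consumer.start()
for v in (1, 2, 3):
    cell.write(v)   # reads and writes strictly alternate on the cell
consumer.join()
print(out)  # [1, 2, 3]
```

Because reads and writes alternate on the cell, no value is lost or read twice - the guarantee the text ascribes to the full/empty attribute.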
Both normal and synchronized memory access are available to the Fortran programmer as well as the assembly programmer. Software modules in both Fortran and assembler programs may be distributed across several PEMs to achieve increased throughput. In general, the design of a parallel program is not affected by whether the program will run in one or several PEMs.
[Figure: HEP System Structure - PEMs and data memories on the switch network, with I/O interfaces, cache, peripherals, I/O sub-systems, and external I/O (DAC, ADC, clocks, discrete I/O)]
In HEP, creation and termination of parallel processes in an MIMD program is a hardware capability directly available to the programmer. Processes are created or terminated in 100 nanoseconds by execution of a single HEP instruction. Thus, processes may be created at any point in a program where additional parallelism is required, and terminated as soon as their function is accomplished. Up to 64 user processes may exist simultaneously within each PEM in a HEP system.
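In conventional-software terms, this create-at-the-point-of-parallelism style looks like spawning workers exactly where extra parallelism is wanted and retiring them as soon as their share is done. The sketch below is a loose Python analogy - thread creation here costs vastly more than HEP's 100 ns single-instruction create, and the function names are illustrative, not HEP constructs:

```python
import threading

def dot_chunk(a, b, lo, hi, out, idx):
    """Partial dot product over a[lo:hi] - one worker's share."""
    out[idx] = sum(x * y for x, y in zip(a[lo:hi], b[lo:hi]))

def parallel_dot(a, b, nproc=4):
    # Create workers only at the point where parallelism is needed...
    partial = [0] * nproc
    step = (len(a) + nproc - 1) // nproc
    workers = [
        threading.Thread(target=dot_chunk,
                         args=(a, b, i * step, (i + 1) * step, partial, i))
        for i in range(nproc)
    ]
    for w in workers:
        w.start()
    # ...and terminate them as soon as their function is accomplished.
    for w in workers:
        w.join()
    return sum(partial)

print(parallel_dot(list(range(8)), list(range(8))))  # 140
```

The parallel section exists only for the duration of the computation that needs it, mirroring the usage pattern the text describes.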
In order to efficiently manipulate data, each PEM contains 2,048 internal general-purpose registers. PEMs automatically detect and flag normal arithmetic errors (overflow, underflow, etc.) and may generate traps on occurrence of these errors. Programs in a HEP system are protected from each other and relocated in memory by a set of relocation/protection registers in each PEM. This allows multiprogramming in a HEP system with full isolation of one user from the next.
All data and instruction words in a HEP are 64 bits long, although PEM data memory reference instructions allow partial-word and byte addressing. The memory bandwidth in a HEP system is 20 million words/second per PEM, including the data switch network. Each PEM executes up to 10 million instructions per second. The architecture of the switch network allows up to 128 memory modules of up to one million words each, and up to 16 PEMs. This range of system configurations results in speeds up to 160 million instructions per second on 64-bit data and memory sizes up to one billion bytes.
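The quoted system maxima follow directly from the per-module figures stated above; a small arithmetic check:

```python
# Per-module figures from the text.
mips_per_pem = 10_000_000        # instructions/second per PEM
max_pems = 16
max_memory_modules = 128
words_per_module = 1_000_000
bytes_per_word = 8               # 64-bit words

peak_mips = mips_per_pem * max_pems
peak_bytes = max_memory_modules * words_per_module * bytes_per_word

print(peak_mips)   # 160000000 -> 160 million instructions/second
print(peak_bytes)  # 1024000000 -> roughly "one billion bytes"
```

The maximum configuration works out to 160 million instructions per second and about 1.024 billion bytes of memory, matching the brochure's figures.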
HEP systems may include high-speed, real-time I/O devices connected to the data switch network. These devices operate at memory speeds up to 80 million bytes/second. Normal I/O devices are connected to the HEP system through support processors. Thus, standard commercial I/O devices and controllers may be used for routine I/O functions. All standard I/O devices are accessible through the HEP operating system and Fortran.

[Figure: Process Execution Module Structure]

HEP Software

HEP systems support a batch operating system with Fortran and assembler programming languages. The HEP operating system provides input and output spooling, batch job scheduling and full operator control of the system.

HEP Fortran is an extended ANSI Fortran IV with added parallel capabilities. The Fortran programmer has access to all standard Fortran formatted and unformatted I/O capabilities. In addition to the relaxation of syntax common to many Fortran compilers, HEP Fortran provides the programmer with the means for explicit parallel programming. A math library is also available which generates parallelism in the evaluation of known functions.

The HEP Assembly Language allows the user to access all of the capabilities of the system in an efficient manner. HEP Assembly Language subroutines may be included in a Fortran job to improve the efficiency of certain heavily used sections of code. Assembly Language programs have direct access to all hardware capabilities, including the direct creation and termination of arbitrary processes.

The HEP Link-Editor binds programs and subroutines into processes, tasks, and jobs. The input is from either HEP Fortran or HEP Assembler. The output is HEP machine-executable code which is input to the HEP Loader at execution time. The HEP Link-Editor runs as a user job in the HEP PEM.

The HEP File System provides a large-volume, high-data-rate I/O capability via the HEP Switch to a HEP System with multiple Process Execution Modules (PEMs). Sequential access to information stored in multiple moving-head disk files is provided to the system at data rates from 80 megabytes per second (the maximum input rate for the switch) to approximately 1 megabyte per second (the rotating storage data rate). Random access to information is provided with comparable bandwidth, depending on the logical file size and the access patterns.
The HEP Interactive Maintenance Language (IML) provides a sophisticated yet easy-to-use language for debugging the HEP System. It is used in conjunction with maintenance hardware, with either test slots in the HEP main frame or off-line test fixtures. The language is procedure-oriented, thus permitting complex functions to be coded into higher-order procedures.
The most obvious area of HEP application is the multiprogramming of ordinary SISD algorithms. This application does not use the inter-process communications features of HEP, but fully utilizes its computing capacity. Since HEP's parallel architecture allows more complete utilization of its hardware, the cost effectiveness of HEP multiprogramming is higher than for other machines of comparable performance. Another benefit of HEP's effectiveness at conventional computation is that it can easily run all jobs at a facility - not just those which are sufficiently large or important to be written in parallel.
The application for which HEP was originally designed was
the solution of systems of ordinary differential equations,
such as those describing flight dynamics problems. In these
problems, a substantial system of dissimilar equations must
be solved, often in real-time. Many of the functional rela-
tionships in the equations are empirically derived and must
be repetitively evaluated by multi-dimensional interpolation in lookup tables. Historically, such problems could
only be solved, with limited precision and great expense,
using analog computers. The HEP MIMD architecture. is
the first commercially available digital technology capable
of effectively addressing these problems.
Another application area well suited to HEP is the solution
of partial differential equations describing continuous
media. These equations, which occur in fluid dynamics
and heat transfer problems, are typically modelled using a
grid of lattice points within the continuous medium. The
behavior at a point is a function of the values at its neighbors. The HEP's architecture allows these problems
to be solved with full parallelism, even in the presence of
irregular or time-varying lattice geometry, or with complex
functional relationships between lattice points.
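A lattice computation of this kind can be sketched in miniature. The example below is a hypothetical Jacobi-style averaging update on a 1-D grid - not a method the brochure specifies - in which each interior lattice point's update runs as its own logical process, as each could on a HEP instruction stream:

```python
import threading

def jacobi_step(grid):
    """One relaxation sweep: each interior point becomes the average
    of its two neighbors, with all points updated concurrently."""
    new = grid[:]                      # boundary values are kept fixed
    def update(i):
        new[i] = 0.5 * (grid[i - 1] + grid[i + 1])
    workers = [threading.Thread(target=update, args=(i,))
               for i in range(1, len(grid) - 1)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return new

# Boundary values 0 and 1; the interior relaxes toward a straight line.
g = [0.0, 0.0, 0.0, 1.0]
print(jacobi_step(g))  # [0.0, 0.0, 0.5, 1.0]
```

Because every point reads the old grid and writes a distinct slot of the new one, the updates are independent and fully parallel, even if the lattice geometry were irregular.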
A fourth, and very general, application area for HEP is that class of problems for which a large number of discrete elements must be modeled or computed upon. Examples of such problems are tree and table searches, multi-particle physics problems, electric power distribution, and fractional distillation simulations. In all cases, complex behavior at a number of sites must be modeled, and interaction between the sites is critical to the result. Such problems are easily solved on the HEP.
The computing requirements for each of these applications are different. To effectively supply the range of capabilities needed, the HEP system is available in several configurations.
HEP's building-block architecture offers total flexibility, enabling the user to start with the exact amount of computer power needed. As computing requirements grow, HEP's field-expandability allows the user to easily and economically add hardware and software modules to accommodate the largest of applications. These advanced features clearly place HEP in the forefront of digital computer technology and provide strong competition for existing computer systems, both scalar and vector.

The evolution of HEP is a natural result of Denelcor's on-going commitment to meet the market needs with state-of-the-art, high-quality systems.
o 10 Million to 160 Million Instructions per Second, Scalar or Vector
o 2,048 to 32,768 General-Purpose, 64-bit Registers
o 262,000 to One Billion Bytes Memory Capacity
o Parallel Computing in Fortran
o Fail-Soft Architecture
Denelcor, Inc.
3115.East 40th Avenue
-Denver, Colorado 80205
(303) 399-5700
TWX: 910-931-2201