HETEROGENEOUS ELEMENT PROCESSOR

Document Type: 
Collection: 
Document Number (FOIA) /ESDN (CREST): 
CIA-RDP86B00689R000300040031-0
Release Decision: 
RIFPUB
Original Classification: 
K
Document Page Count: 
7
Document Creation Date: 
December 20, 2016
Document Release Date: 
July 17, 2007
Sequence Number: 
31
Case Number: 
Publication Date: 
March 1, 1981
Content Type: 
OPEN SOURCE
File: 
CIA-RDP86B00689R000300040031-0.pdf (1.25 MB)
Body: 
Approved For Release 2007/07/17: CIA-RDP86B00689R000300040031-0

Tomorrow's Computer Is Here Today

Denelcor's Heterogeneous Element Processor (HEP) is a large-scale (64-bit) high-speed digital computer whose architecture makes all other supercomputer architecture obsolete. HEP provides a totally new computing environment: high-speed, parallel processing of heterogeneous data elements. HEP has been designed for use in scientific and/or commercial applications which can effectively utilize processing speeds of ten million to 160 million instructions per second. HEP achieves this throughput because of its design, which implements the Multiple Instruction Stream, Multiple Data Stream (MIMD) architectural concept for the first time in a commercially available computer. HEP offers the user up to 1,024 independent instruction streams, or processes, each with its own data stream, to be used concurrently in programming applications. This multiplicity of instruction streams running in parallel enables and encourages breaking the application into its component parts for parallel processing. Other features of the HEP design provide the synchronization necessary to facilitate cooperation between concurrent processes, and eliminate the precedence delays which often occur when parallel processing is attempted using more conventional data processing equipment. An equal number of Supervisory Processes are available for processing the privileged functions necessary to the support of the User Processes, for a total of 2,048 independent instruction streams.

The many capabilities of the HEP hardware are fully supported by HEP System Software, so that the potential performance of the system is realized with relative ease. Using the available System Software, programming HEP is very similar to programming a conventional system, and only minimal additional programmer training is required.

In addition to the obvious design goals of fast throughput and the ability to solve very large and complex problems, HEP is designed for ease of operation and to be highly effective across the full range of general-purpose computing applications.

o HEP hardware is modular and field expandable.
o HEP achieves its high-speed performance through advanced architectural concepts rather than through unproven "leading edge technology" electronic components. This provides the user benefits in economy and reliability.
o HEP Parallel Fortran is designed for maximum similarity to existing languages, with logical extensions as necessary to implement the advanced features of HEP.
o HEP is designed for ease of maintenance in the event of hardware malfunction. Maintainability features are an integral part of the hardware design, including an on-board maintenance diagnostic system which implements an Interactive Maintenance Language for diagnostic purposes.

Evolution of Computer Architecture

The earliest computers executed a single instruction at a time, using a single piece of data. The architecture of these machines, called SISD (for Single Instruction, Single Data Stream) computers, was straightforward and well suited to the technology of the times. As technology advanced and computer users required greater performance, SISD machines were made faster and faster, using newer and better components and designs. But a fundamental problem remained. Although the execution of a computer instruction is physically composed of several parts - instruction fetch, operand fetch, execution and result store - the SISD computer could only perform one of these at a time, since each step depended on the completion of the previous one. Thus, three-fourths of the expensive hardware stood idle at any given time, waiting for the rest of the hardware to finish operation.

[Figure: SISD - Single Instruction, Single Data Stream]

SISD designers attempted to remedy this by a technique called "look-ahead", in which instruction fetch for the next instruction was overlapped with some portion of the execution of the current instruction. This provided some performance improvement. However, digital computer programs, particularly those written in higher level languages, contain large numbers of test and branch instructions, in which the choice of the next instruction depends on the outcome of the current instruction. In such cases, "look-ahead" offers no speedup, and introduces substantial complexity to make sure that the partial execution of an incorrect next instruction does not contaminate the computation.

Another approach to increasing the speed of computation is to make multiple copies of portions of the SISD hardware. In this approach, called SIMD (for Single Instruction, Multiple Data Stream), the operand fetch, execution and result store portions of the hardware were replicated, so that the execution of a single instruction caused several values to be fetched, computed upon and the answers stored. For certain problems, this provided a substantial performance improvement. With sufficient hardware, entire vectors of numbers could be operated upon simultaneously. However, as with "look-ahead" SISD machines, the occurrence of test and branch instructions, among others, required the machine to wait for the total completion of the instruction before proceeding. The test and branch itself could make no use of the replicated hardware.

In addition, two new problems were created by the SIMD architecture. Substantial portions of most programs are not vector-oriented.
The computation of iteration variables and array subscripts is a scalar problem, for which SIMD offers no speedup, and the collection of operands across arrays is an addressing problem which many SIMD architectures do not handle. As a second problem, if an SIMD computer has a fixed quantity of replicated execution modules (adders, etc.), and if the length of the vector which the user wishes to operate on differs from the vector length of the machine, performance suffers and software complexity increases. The cost of computation remains high since the hardware is often not fully utilized.

[Figure: SIMD - Single Instruction, Multiple Data Stream]

Evolution of Computer Architecture - Continued

The difficulties with the implementation of high performance, cost effective computation using single instruction machines have led to the development of a new concept in computer architecture. This concept, called MIMD (for Multiple Instruction, Multiple Data Stream) architecture, achieves high performance at low hardware cost by keeping all processor hardware utilized executing multiple parallel programs simultaneously. For example, while an add is in progress for one process, a multiply may be executing for another, a divide for a third; or similar functions may be executing simultaneously, such as multiple adds or divides. In MIMD architectures, cooperating programs are often called "processes". Independent programs may contain one or several processes.

Because the multiple instructions executed concurrently by an MIMD machine are independent of each other, execution of one instruction does not influence the execution of other instructions and processing may be fully parallel at all times.

Successful MIMD architectures (figure 3) also provide low-overhead mechanisms for inter-process communication. In these architectures, data locations may contain not only a value but a state. Processes may synchronize by waiting for input locations to have the "full" state. Result storage may wait for output locations to attain the "empty" state resulting from consumption of their contents by other processes. Since this arbitration of the state of memory locations is handled by hardware and without affecting the execution of unrelated instructions, the communication delay is short and the overhead is small.

MIMD computers may be used to execute either SISD or SIMD programs. SISD programs are just MIMD programs with no inter-program communication. Execution of multiple identical MIMD programs is equivalent to execution of an SIMD program. In the SIMD case, MIMD computers may match the vector lengths exactly, while using remaining resources for unrelated computation. Thus, high efficiency may be maintained even through scalar portions of the code. But the major application of MIMD computers lies in problems of sufficient complexity that straightforward vector computation is not feasible. In these cases, which include continuous simulation and complicated partial differential equation solutions, MIMD architecture offers the only possible method of achieving significant parallelism. Denelcor's Heterogeneous Element Processor system is the only commercially available MIMD computer.

[Figure: MIMD - Multiple Instruction, Multiple Data Stream]

HEP Architecture

[Figure: HEP system, including Diagnostic Maintenance Sub-System and High-Speed File]

The HEP computer system consists of process execution modules (PEMs), data memory modules and support processors interconnected by a high-speed data switch network. All data memory modules are accessible by all PEMs.
Thus, processes executing in parallel in one or several PEMs may cooperate by reading and writing shared information in the data memories. Parallel processes synchronize and pass information back and forth using the full/empty attribute of each data memory location. HEP instructions may automatically wait for an input data memory location to be full before execution, and leave the location empty after execution. Instructions may also wait for an output location to be empty before execution and leave it full after execution. This communications discipline allows processes to conveniently and unambiguously pass information to other processes while executing. The full/empty attribute ensures that reads and writes of inter-process variables will alternate and no information will be lost. For locations used exclusively within a process, the full/empty attribute is ignored and memory may be accessed conventionally.

Both normal and synchronized memory access are available to the Fortran programmer as well as the assembly programmer. Software modules in both Fortran and assembler programs may be distributed across several PEMs to achieve increased throughput. In general, design of a parallel program is not affected by whether the program will run in one or several PEMs.

[Figure: HEP System Structure - PEMs, data switch, cache, I/O interfaces, peripheral sub-systems, and external I/O (DAC, ADC, clocks, discrete I/O)]

In HEP, creation and termination of parallel processes in an MIMD program is a hardware capability directly available to the programmer. Processes are created or terminated in 100 nanoseconds by execution of a single HEP instruction. Thus, processes may be created at any point in a program where additional parallelism is required, and terminated as soon as their function is accomplished. Up to 64 user processes may exist simultaneously within each PEM in a HEP system.
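The full/empty discipline described above can be sketched in software. The following is a minimal illustration in Python (not HEP Fortran), using a condition variable to stand in for the hardware full/empty bit on a single shared location; the class and function names are illustrative assumptions, not part of any HEP interface:

```python
import threading

class FullEmptyCell:
    """Software model of a HEP-style memory location with a full/empty bit.
    Reads wait for "full" and leave the cell empty; writes wait for "empty"
    and leave it full, so reads and writes of the cell must alternate."""
    def __init__(self):
        self._value = None
        self._full = False
        self._cond = threading.Condition()

    def write(self, value):
        # Wait for the "empty" state, store, then mark "full".
        with self._cond:
            while self._full:
                self._cond.wait()
            self._value, self._full = value, True
            self._cond.notify_all()

    def read(self):
        # Wait for the "full" state, consume, then mark "empty".
        with self._cond:
            while not self._full:
                self._cond.wait()
            self._full = False
            self._cond.notify_all()
            return self._value

# Two cooperating "processes": a consumer thread and the main (producer) thread.
cell, results = FullEmptyCell(), []
consumer = threading.Thread(
    target=lambda: [results.append(cell.read()) for _ in range(3)])
consumer.start()
for v in (10, 20, 30):
    cell.write(v)   # each write waits until the prior value has been consumed
consumer.join()
print(results)      # -> [10, 20, 30]
```

Because every read empties the cell and every write refills it, no value can be overwritten before it is consumed and none can be read twice, which is the alternation guarantee the brochure attributes to the hardware.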
In order to efficiently manipulate data, each PEM contains 2048 internal general purpose registers. PEMs automatically detect and flag normal arithmetic errors (overflow, underflow, etc.) and may generate traps on occurrence of these errors. Programs in a HEP system are protected from each other and relocated in memory by a set of relocation/protection registers in each PEM. This allows multiprogramming in a HEP system with full isolation of one user from the next.

All data and instruction words in a HEP are 64 bits long, although PEM data memory reference instructions allow partial word and byte addressing. The memory bandwidth in a HEP system is 20 million words/second per PEM, including the data switch network. Each PEM executes up to 10 million instructions per second. The architecture of the switch network allows up to 128 memory modules of up to one million words each and up to 16 PEMs. This range of system configurations results in speeds up to 160 million instructions per second on 64-bit data and memory sizes up to one billion bytes.

HEP systems may include high-speed real-time I/O devices connected to the data switch network. These devices operate at memory speeds up to 80 million bytes/second. Normal I/O devices are connected to the HEP system through support processors. Thus standard commercial I/O devices and controllers may be used for routine I/O functions. All standard I/O devices are accessible through the HEP operating system and Fortran.

[Figure: Process Execution Module Structure]

HEP Software

HEP systems support a batch operating system with Fortran and assembler programming languages. The HEP operating system provides input and output spooling, batch job scheduling and full operator control of the system.

HEP Fortran is an extended ANSI Fortran IV with added parallel capabilities. The Fortran programmer has access to all standard Fortran formatted and unformatted I/O capabilities. In addition to the relaxation of syntax common to many Fortran compilers, HEP Fortran provides the programmer with the means for explicit parallel programming. A math library is also available which generates parallelism in the evaluation of known functions.

The HEP Assembly Language allows the user to access all of the capabilities of the system in an efficient manner. HEP Assembly Language subroutines may be included in a Fortran job to improve the efficiency of certain heavily used sections of code. Assembly Language programs have direct access to all hardware capabilities, including the direct creation and termination of arbitrary processes.

The HEP Link-Editor binds programs and subroutines into processes, tasks, and jobs. The input is from either HEP Fortran or HEP Assembler. The output is HEP machine executable code, which is input to the HEP Loader at execution time. The HEP Link-Editor runs as a user job in the HEP PEM.

The HEP File System provides a large volume, high data rate I/O capability via the HEP Switch to a HEP system with multiple Process Execution Modules (PEMs). Sequential access to information stored in multiple moving-head disk files is provided to the system at data rates from 80 megabytes per second (the maximum input rate for the switch) to approximately 1 megabyte per second (the rotating storage data rate). Random access to information is provided with comparable bandwidth, depending on the logical file size and the access patterns.

The HEP Interactive Maintenance Language (IML) provides a sophisticated yet easy-to-use language for debugging the HEP System.
It is used in conjunction with maintenance hardware, either test slots in the HEP main frame or off-line test fixtures. The language is procedure-oriented, thus permitting complex functions to be coded into higher order procedures.

HEP Applications

The most obvious area of HEP application is the multiprogramming of ordinary SISD algorithms. This application does not use the inter-process communications features of HEP, but fully utilizes its computing capacity. Since HEP's parallel architecture allows more complete utilization of its hardware, the cost effectiveness of HEP multiprogramming is higher than for other machines of comparable performance. Another benefit of HEP's effectiveness at conventional computation is that it can easily run all jobs at a facility - not just those which are sufficiently large or important to be written in parallel.

The application for which HEP was originally designed was the solution of systems of ordinary differential equations, such as those describing flight dynamics problems. In these problems, a substantial system of dissimilar equations must be solved, often in real-time. Many of the functional relationships in the equations are empirically derived and must be repetitively evaluated by multi-dimensional interpolation in lookup tables. Historically, such problems could only be solved, with limited precision and great expense, using analog computers. The HEP MIMD architecture is the first commercially available digital technology capable of effectively addressing these problems.

Another application area well suited to HEP is the solution of partial differential equations describing continuous media.
These equations, which occur in fluid dynamics and heat transfer problems, are typically modeled using a grid of lattice points within the continuous medium. The behavior at a point is a function of the values at its neighbors. The HEP's architecture allows these problems to be solved with full parallelism, even in the presence of irregular or time-varying lattice geometry, or with complex functional relationships between lattice points.

A fourth, and very general, application area for HEP is that class of problems for which a large number of discrete elements must be modeled or computed upon. Examples of such problems are tree and table searches, multi-particle physics problems, electric power distribution, and fractional distillation simulations. In all cases, complex behavior at a number of sites must be modeled, and interaction between the sites is critical to the result. Such problems are easily solved on the HEP.

The computing requirements for each of these applications are different. To effectively supply the range of capabilities needed, the HEP system is available in several configurations.

HEP's building-block architecture offers total flexibility, enabling the user to start with the exact amount of computer power needed. As computing requirements grow, HEP's field-expandability allows the user to easily and economically add hardware and software modules to accommodate the largest of applications. These advanced features clearly place HEP in the forefront of digital computer technology and provide strong competition for existing computer systems, both scalar and vector. The evolution of HEP is a natural result of Denelcor's on-going commitment to meet the market needs with state-of-the-art, high-quality systems.
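The lattice computation described above can be sketched concretely. Below is a minimal Python illustration (not HEP Fortran) of one Jacobi-style relaxation sweep for a heat-transfer grid, with the interior rows of the lattice handed to a pool of worker threads much as HEP would spread them across parallel processes; the function names and the four-neighbor averaging rule are illustrative assumptions, not taken from HEP documentation:

```python
from concurrent.futures import ThreadPoolExecutor

def relax_row(grid, new, i):
    """Update one interior row: each point becomes the average of its
    four neighbors (a Jacobi step for the Laplace/heat equation)."""
    for j in range(1, len(grid[0]) - 1):
        new[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                            grid[i][j-1] + grid[i][j+1])

def jacobi_sweep(grid, workers=4):
    """One full sweep with fixed boundary values. Every row reads only
    the old grid and writes only its own row of the new grid, so the
    rows are independent and may be relaxed fully in parallel."""
    new = [row[:] for row in grid]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for i in range(1, len(grid) - 1):
            pool.submit(relax_row, grid, new, i)
    return new  # pool context exit waits for all row updates to finish

# 4x4 grid: top edge held at 100.0, all other points start at 0.0.
g = [[100.0] * 4] + [[0.0] * 4 for _ in range(3)]
g = jacobi_sweep(g)
print(g[1][1])   # interior point warmed by the hot top edge -> 25.0
```

Because each worker touches a disjoint slice of the output, no synchronization between rows is needed within a sweep; in a full solver the sweep would simply be repeated until the grid stops changing.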
o 10 Million to 160 Million Instructions per Second, Scalar or Vector
o 2,048 to 32,768 General Purpose 64-bit Registers
o 262,000 to One Billion Bytes Memory Capacity
o Parallel Computing in Fortran
o Fail-Soft Architecture

Denelcor, Inc.
3115 East 40th Avenue - Denver, Colorado 80205
(303) 399-5700 TWX: 910-931-2201