LETTER TO WILLIAM CASEY FROM GORDON BELL
Document Type:
Collection:
Document Number (FOIA) /ESDN (CREST):
CIA-RDP88G01332R000901070030-2
Release Decision:
RIPPUB
Original Classification:
K
Document Page Count:
15
Document Creation Date:
December 27, 2016
Document Release Date:
January 13, 2012
Sequence Number:
30
Case Number:
Publication Date:
October 10, 1986
Content Type:
LETTER
File:
| Attachment | Size |
| --- | --- |
| CIA-RDP88G01332R000901070030-2.pdf | 795.84 KB |
Body:
Declassified in Part - Sanitized Copy Approved for Release 2012/01/13: CIA-RDP88G01332R000901070030-2
[Routing and transmittal slip, 20 OCT 1986; mostly illegible. Legible entries include EXA/DDA, DDA Registry, "D/OIT received a copy," and a STAT notation.]
Background
The current surge of interest in supercomputers becomes clear when we look at the evolution from
the late 70's, when the Cray 1 and VAX 780 were the standards for computation. The 780 entered
the scientific and engineering community because it provided roughly the same price/performance
as a Cray 1, even though the performance differed by a factor of 80 (using Linpack as an
indicator); a more reasonable estimate of the difference is a factor of 20-40. Those
who bought VAXen observed that since the average user only got 1-2 hours of Cray time each
week (50-100 hours per year), they could get the same amount of computing done by letting a
VAX grind 20-160 hours per week.
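As a rough check on this equivalence, the sketch below simply multiplies the weekly Cray allocation by the assumed speed ratio; the 1-2 hour allocation and the 20-80x ratio are the figures quoted above, and the script is an illustration only, not a benchmark.

```python
# Rough check of the Cray-1 / VAX-11/780 equivalence argument above.
# Allocations and speed ratios are the figures quoted in the text, not measurements.
cray_hours_per_week = (1, 2)   # typical weekly Cray-1 allocation per user
speed_ratios = (20, 80)        # Cray-1 speed relative to a VAX-11/780

for cray_hrs in cray_hours_per_week:
    for ratio in speed_ratios:
        vax_hrs = cray_hrs * ratio   # VAX hours needed to do the same work
        print(f"{cray_hrs} Cray hour(s)/week x {ratio} = {vax_hrs} VAX hours/week")
```

The products range from 20 to 160 hours, matching the "20-160 hours per week" figure above.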
Over time, the Cray evolved; the XMP was sped up by over a factor of two and built as a
multiprocessor, which roughly trebled the performance/price. When the scientific community
started utilizing Crays with improved compilers, they began to develop more effective algorithms
for vectors that increased the effective power of the machines. The delay in getting a more
cost-effective VAX (the 8600 was two years late), and the relatively high price of VAXen
exacerbated the difference between the supercomputer and the super-minicomputer (in essence a
lower priced mainframe). The popularity of VAXen for more general computing also allowed the
price to remain high, by giving it a market outside the research community. DEC, like IBM when it
introduced a complete range of compatible computers, may have become less interested in and
attentive to the research community. The Cray/VAX gap may have been a major motivation in the
formation of the NSF Advanced Scientific Computing Program.
In the early 80's Alliant, Convex, and Scientific Computer Systems formed to exploit the
performance/price gap between the Cray XMP and VAX by utilizing vector data-types pioneered in
the Cray 1. Thus, a new class of mini-supercomputers was formed, all of which have better
performance/price than the Cray (almost a factor of 2 in the case of the new SCS-40).
By 1985, ten years after the Cray 1, IBM and Japanese manufacturers building IBM-compatible
mainframes had added vectors and multi-processors to their machines.
Observations About the Computers From the Table
Three characteristics are important: the processing power in Megaflops, the cost-effectiveness in
flops/$, and the stretch time versus a Cray. There are exceptional computers when comparing the
cost-effectiveness in each class: the (projected) ETA-10 (to be better by a factor of 8!) and the
SCS-40 (better by almost a factor of 2). The SCS-40's virtue and principal flaw is Cray
compatibility. Other mini-supers have virtual memory. A cluster of SUN workstations could
provide up to a factor of 2 better performance/price, depending on the amount of secondary
memory. The factor of 5 difference in the speed of the ETA-10 versus a Cray XMP should open
up new problem solution domains. The ETA-10 uses large CMOS gate arrays on large, multilayer
printed circuit boards. This kind of fabrication provides a potential breakthrough in cost that is
counter to the use of ECL to build supercomputers, large mainframes, and superminicomputers by
Cray, DEC, IBM, and the Japanese. Both the Cray 2 and ETA-10 have large memories that should
open up new problem domains. All of the machines, except the Crays, have virtual memory.
Because of the lack of paging, it may be difficult for multiple users with very large problems to
effectively utilize the Cray 2. The use of large physical and virtual memories needs to be explored
and understood.
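To make the three metrics concrete, here is a small, purely illustrative sketch; the machine price and Linpack rates below are placeholder values, not figures from the table.

```python
# Illustrative computation of the three metrics used in the table:
# processing power (Mflops), cost-effectiveness (flops/$), and stretch time versus a Cray.
def metrics(machine_mflops, machine_price_dollars, cray_mflops):
    flops_per_dollar = machine_mflops * 1e6 / machine_price_dollars
    stretch = cray_mflops / machine_mflops   # how much longer the same job takes
    return flops_per_dollar, stretch

# Hypothetical mini-super: 20 Mflops at $500,000, against a 100-Mflop Cray XMP processor.
fpd, stretch = metrics(20, 500_000, 100)
print(f"{fpd:.0f} flops/$, stretch factor {stretch:.0f}x versus the Cray")
```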
While the table shows times for a floating-point intensive program, Linpack, it is unclear how the
machines perform under comparable workloads or whether they will actually be used in the same
fashion. For example, a slower machine is likely to be used more interactively and results of the
computation viewed constantly to avoid unnecessary work. Users of large batch machines may
have to request more work and output because turn-around is longer. Scalar benchmarks aren't
given, and most machines are used a significant amount of time either interactively or in scalar
mode, both of which lower the performance and favor the 3090 (which outperforms the Crays in
scalar mode), mini-supers, and workstations.
NEC's SX-2, not included in the Table, executes Linpack at about twice the performance of a
single processor Cray XMP. The performance/price is unclear.
Many computers exhibit performance/price comparable to today's supercomputers. The Advanced
Scientific Computing Program must understand the relative power and work capacity of all forms
of computation and begin to develop ways to apply resources appropriate to user need and
cost-effectiveness considerations.
Can Users Tolerate the Time Stretch / Lower Cost Trade-off?
Can a user of a smaller computer stand the lengthened turn-around time that comes with using a
slower computer and stretching the computation time by factors of 4 to 10? At present, only one or
two users within our user community are receiving an hour of computer time per day. The
mini-supercomputers, supplying the equivalent of one hour of Cray time in 4-10 hours, are
competitive because the average turn-around for a one-hour job on a Cray can easily be this long.
The typical turn-around for a 15-minute job is 2 hours (a factor of 8 stretch). The Sun
Workstation might be used for longer computation provided the user "guides" the computation.
The Sun's stretch factor is comparable to that experienced between the Cray and 780 during the late
70's. Alternatively, advances in partitioning programs for parallel processing give the cluster
the best performance/price if a job can be parallelized using a message-passing model of
computation.
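The turn-around argument reduces to a few lines of arithmetic; the queue delay and stretch factors below are the rough figures quoted above, and the local machine is assumed to be dedicated.

```python
# Sketch of the turn-around trade-off: a dedicated but slower machine can deliver an
# answer in about the same elapsed time as a shared Cray once queueing is included.
cray_cpu_hours = 1.0           # work expressed in Cray CPU hours
cray_turnaround_hours = 8.0    # text: roughly an 8x stretch between CPU time and turn-around

for stretch in (4, 10):        # mini-supercomputer runs the job 4-10x slower
    mini_elapsed = cray_cpu_hours * stretch   # assumes negligible queueing locally
    print(f"stretch {stretch}x: answer in {mini_elapsed:.0f} h locally "
          f"vs. ~{cray_turnaround_hours:.0f} h turn-around on the shared Cray")
```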
Based on the performance and time allocations inherent in supercomputer use, a complete
hierarchy of computers will exist and is justified. Given that an individual user or project is
likely to need access to all levels of the hierarchy, a compatible (and most likely standardized)
basic environment that can support user communities who in turn run common applications
environments is essential.
Multiprocessors, Array processors and Multicomputers (e.g. Hypercubes) for
Parallel Processing
A number of alternatives exist that may offer significant improvements in performance or
performance/price. For example, a 64 computer NCUBE has been used to solve a problem that
took twice as long on a single processor XMP. The improvement yielded almost an order of
magnitude in cost. Given the decomposition for parallel processing on the NCUBE, an XMP might
be used to gain a 4 times speed-up; in fact, the XMP operating in this mode has computed Linpack
at a rate of 713 Mflops, which is 26 times the single processor rate. Likewise, array processors
such as the FPS X64 have been lashed to minis and mainframes, yielding significant improvements
in performance/price. None of these alternatives are explored.
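The cost-effectiveness claim follows from simple ratios. In the sketch below, only the 2x runtime ratio comes from the NCUBE example above; the prices are hypothetical round numbers chosen to show how "almost an order of magnitude" can arise.

```python
# Rough cost-effectiveness comparison implied by the NCUBE example.
xmp_time = 2.0       # single-processor Cray XMP elapsed time (relative units, from the text)
ncube_time = 1.0     # 64-node NCUBE elapsed time on the same problem

xmp_price = 5_000_000     # hypothetical price, for illustration only
ncube_price = 1_000_000   # hypothetical price, for illustration only

speedup = xmp_time / ncube_time
gain = speedup * (xmp_price / ncube_price)   # ratio of (performance/price) for the two machines
print(f"speedup {speedup:.0f}x, cost-effectiveness gain ~{gain:.0f}x")
```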
Standardized parallel processing primitives in all programming languages, based on a multiprocess,
message-passing model of computation, are needed so that programs structured in this fashion
operate compatibly and identically across workstation clusters, multicomputers such as the
hypercubes, and shared-memory multiprocessors (e.g. Cray and ETA). Given the relatively
constant performance/price and similar turn-around times for all of the computing alternatives,
parallel processing becomes essential.
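As an illustration only of the message-passing style argued for here (not of any particular 1986 system or standard), the sketch below uses Python's multiprocessing module: work is handed to independent processes over explicit channels and partial results are sent back, with no shared memory.

```python
# Minimal sketch of a message-passing computation: explicit send/receive, no shared state.
from multiprocessing import Process, Pipe

def worker(conn):
    chunk = conn.recv()                      # receive a slice of the work
    conn.send(sum(x * x for x in chunk))     # send back a partial result
    conn.close()

if __name__ == "__main__":
    data = list(range(1000))
    chunks = [data[:500], data[500:]]
    links, procs = [], []
    for chunk in chunks:
        parent, child = Pipe()
        p = Process(target=worker, args=(child,))
        p.start()
        parent.send(chunk)                   # all communication is by message
        links.append(parent)
        procs.append(p)
    total = sum(link.recv() for link in links)
    for p in procs:
        p.join()
    print(total)   # same answer on one node or many; only the mapping of work changes
```

Because the only coupling between processes is the messages themselves, the same program structure maps onto a workstation cluster, a hypercube, or a shared-memory multiprocessor.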
The Role of the Super Computer Centers
Historically, centers have existed for a variety of reasons including cost sharing, technology,
performance, networking, user needs, local politics, government funding, etc. Clearly, when hot
ideas emerge and projects need ten to several hundred hours of supercomputer time that can't be
supplied locally, the centers are essential. The definition of the kinds of work that the centers will
support is critical, given that computation can be done very effectively by local university centers,
departments, projects, and individuals at workstations.
Our centers are critical to scientific and engineering computing for the research community. Today
the centers train users about the parallelism inherent with vector data-types. They have the
programs and staff to train the trainers and users rapidly, and to support large programs and
datasets inherent in supercomputer use. Centers may be the best place to support certain large
programs and databases for a given intellectual community; NCAR is an excellent example as it
provides millions of lines of common programs and 17 terabits of common data for its community
of atmospheric scientists to environmental engineers. Centers may also support common programs
for communities of distributed users at mini-supers, super-minis, and workstations in order to
supply service when the distributed research requires significant computing power.
Large amounts of power (on the order of 1 hour per day) would be supplied to large projects that
do not have machines, and to a community of student and casual users who access common
programs and data. If the "average" project uses 1 hour per day or 350 hours per year, then a Cray
XMP would support 24x4, or about 100 projects! Projects of this size would be, in effect,
subsidized at about $100,000 with steady-state costs. It can, alternatively, service 640 users who
use at most an hour a week, or 50 hours per year, providing them about a $15,000 subsidy.
Finally, several thousand student and casual users who would use no more than 10 hours per year
(a year on a PC/370) could be supported at negligible cost. Policy statements are needed which
characterize usage across geography, users, and discipline.
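These figures can be checked with back-of-the-envelope arithmetic; the sketch assumes a four-processor XMP available around the clock, which is an assumption for illustration rather than a statement from the table.

```python
# Back-of-the-envelope check of the allocation figures above.
processors = 4                                   # assumed four-processor Cray XMP
machine_hours_per_year = 24 * 365 * processors   # ~35,000 machine hours per year

print(machine_hours_per_year // 350)   # ~100 projects at 1 hour/day (350 hours/year)
print(machine_hours_per_year // 50)    # ~700 light users at 50 hours/year (text quotes 640)
```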
The centers have a lead role in supporting state-of-the-art computers of all types including
supercomputers, mini-supercomputers, and larger scale experimental machines. The centers
should be the beta test sites of all new systems, especially those which can not be easily purchased
or supported by local researchers or departments. The centers must take the lead role in
understanding benchmarks, workloads, and cost-effectiveness of all forms of computation.
Standards. The three alternative forms of computation that form the main line of computing all
provide roughly the same computational service at comparable costs (not including the cost to the
user). We must establish standards that make it equally easy for users to work at any of the places
in a compatible fashion. In many cases, a user will use the super or mini-super or existing
super-mini for calculations and the workstation to view results. Thus code will be run in a highly
distributed fashion across different machines, including new and evolving UNIX-compatible PC's.
The centers should work toward establishing and supporting common programs and data across
engineering and scientific disciplines so they may compute at any level of the hierarchy.
Conclusions
Computers now exist which allow various styles of computing ranging from regional
supercomputers to personal workstations. All of the computers in the hierarchy will continue to
exist and flourish because, with the exception of the ETA-10 to be delivered next year, all offer
relatively the same cost and effectiveness.
Having the wide range of styles and locations demands attention to:
- training, education, and program support;
- networks for intercommunication of programs, data, and terminal access;
- benchmarks, workloads, accounting, and pricing, i.e. understanding cost and effectiveness;
- allocation of time across user communities by size, discipline, and geography;
- standardized programming environments and graphics enabling effective use;
- supporting specialized community programs (e.g. NASTRAN) and databases (e.g. NCAR);
- specialized and alternative computers; and
- standards, understanding, and training for compatible, message-passing parallel processing.
With the center program entering phase II, attention and resources will have to be focused on these
demands.