Arch Discussion: ATTACK OF KILLER MICROS


John D. McCalpin
1989-10-14 15:07:25 EST
In article <35825@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov writes:

>m*h@mips.com pointed out some important considerations in the issue
>of whether supercomputers as we know them will survive. I thought
>that I would attempt to get a discussion started. Here is a simple
>fact for the mill, related to the question of whether or not machines
>delivering the fastest performance at any price have room in the
>market.

>Fact number 1:
>The best of the microprocessors now EXCEED supercomputers for scalar
>performance and the performance of microprocessors is not yet stagnant.
>On scalar codes, commodity microprocessors ARE the fastest machines at
>any price and custom cpu architectures are doomed in this market.

>b*s@maddog.llnl.gov, brooks@maddog.uucp

This much has been fairly obvious for a few years now, and was made
especially clear by the introduction of the MIPS R-3000 based machines
at about the beginning of 1989. I think that this point is irrelevant
to the more appropriate purpose of supercomputers, which is to run
long (or large), compute-intensive problems that happen to map well
onto available architectures.

Both factors (memory size/run time and architectural efficiency) are
important here. It is
generally not necessary to run short jobs on supercomputers, and it is
not cost-effective to run scalar jobs on vector machines. On the
other hand, I have several codes that run >100 times faster on the
ETA-10G relative to a 25 MHz MIPS R-3000. Since I need to run these
codes for hundreds of ETA-10G hours, the equivalent time on the
workstation is over one year.

The introduction of vector workstations (Ardent & Stellar) changes
these ratios substantially. The ETA-10G runs my codes only 20 times
faster than the new Ardent Titan. In this environment, the important
question is, "Can I get an average of more than 1.2 hours of
supercomputer time per day". If not, then the Ardent provides better
average wall-clock turnaround.
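
To make the break-even arithmetic explicit, here is a throwaway C
fragment (the 20:1 ratio is from my codes above; the rest is just
arithmetic, nothing machine-specific):

    #include <stdio.h>

    int main(void)
    {
        double speedup = 20.0;   /* ETA-10G vs. Ardent Titan, from above */
        double day = 24.0;       /* dedicated workstation hours per day */
        /* S super-hours do the work of S*speedup workstation-hours, so
           the super only wins if I can get more than 24/speedup hours
           of it per day. */
        printf("break-even: %.2f supercomputer hours/day\n", day / speedup);
        return 0;
    }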

It seems to me that the introduction of fast scalar and vector
workstations can greatly enhance the _important_ function of
supercomputers --- which is to allow the calculation of problems that
are otherwise too big to handle. By removing scalar jobs and vector
jobs of short duration from the machine, more resources can be
allocated to the large calculations that cannot proceed elsewhere.

Enough mumbling....
--
John D. McCalpin - mccalpin@masig1.ocean.fsu.edu
m*n@scri1.scri.fsu.edu
m*n@delocn.udel.edu

Eugene Brooks
1989-10-14 16:58:54 EST
m*h@mips.com pointed out some important considerations in the issue of whether
supercomputers as we know them will survive. I thought that I would attempt
to get a discussion started. Here is a simple fact for the mill, related to
the question of whether or not machines delivering the fastest performance
at any price have room in the market.

Fact number 1:
The best of the microprocessors now EXCEED supercomputers for scalar
performance and the performance of microprocessors is not yet stagnant.
On scalar codes, commodity microprocessors ARE the fastest machines at
any price and custom cpu architectures are doomed in this market.

b*s@maddog.llnl.gov, brooks@maddog.uucp

Robert Colwell
1989-10-14 22:12:04 EST
In article <35825@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov () writes:
>Fact number 1:
>The best of the microprocessors now EXCEED supercomputers for scalar
>performance and the performance of microprocessors is not yet stagnant.
>On scalar codes, commodity microprocessors ARE the fastest machines at
>any price and custom cpu architectures are doomed in this market.

I take my hat off to them, too, because that's no mean feat. But don't
forget that the supercomputers didn't set out to be the fastest machines
on scalar code. If they had, they'd all have data caches, non-interleaved
main memory, and no vector facilities. What the supercomputer designers
are trying to do is balance their machines to optimally execute a certain
set of programs, not the least of which are the LLL loops. In practice
this means that said machines have to do very well on vectorizable code,
while not falling down badly on the scalar stuff (lest Amdahl's law
come to call).
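
To put numbers on that: if a fraction f of the run time vectorizes
with speedup s, Amdahl gives an overall speedup of 1/((1-f) + f/s).
A quick C illustration (the f and s values are made up, not
measurements from any real machine):

    #include <stdio.h>

    /* Amdahl's law: overall speedup when fraction f of the run time
       is accelerated by a factor s and the rest runs at scalar speed. */
    static double amdahl(double f, double s)
    {
        return 1.0 / ((1.0 - f) + f / s);
    }

    int main(void)
    {
        /* 90% vectorizable with a 40x vector unit gains only ~8x
           overall; the scalar 10% dominates. */
        printf("f=0.90, s=40: %4.1fx\n", amdahl(0.90, 40.0));
        printf("f=0.99, s=40: %4.1fx\n", amdahl(0.99, 40.0));
        return 0;
    }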

So while it's ok to chortle at how the micros have caught up on the scalar
stuff, I think it would be an unwarranted extrapolation to imply that the
supers have been superseded unless you also specify the workload.
And by the way, it's the design constraints at the heavy-duty, high-
parallelism, all-functional-units-going-full-tilt-using-the-entire-memory-
bandwidth end that make the price of the supercomputers so high, not the
constraints that predominate at the scalar end. That's why I conclude
that when the micro/workstation guys want to play in the supercomputer
sandbox they'll either have to bring their piggy banks to buy the
appropriate I/O and memory, or convince the users that they can live
without all that performance.

Bob Colwell ..!uunet!mfci!colwell
Multiflow Computer or colwell@multiflow.com
31 Business Park Dr.
Branford, CT 06405 203-488-6090

Preston Briggs
1989-10-14 22:59:35 EST
In article <35825@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov () writes:
>The best of the microprocessors now EXCEED supercomputers for scalar
>performance and the performance of microprocessors is not yet stagnant.

Is this a fair statement? I've played some with the i860 and
I can write (by hand so far) code that is pretty fast.
However, the programs where it really zooms are vectorizable.
That is, I can make this micro solve certain problems well;
but these are the same problems that vector machines handle well.

Getting good FP performance from a micro seems to require
pipelining. Keeping the pipe(s) full seems to require a certain amount
of parallelism and regularity. Vectorizable loops work wonderfully well.
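
For example (generic C on my part, nothing i860-specific): a
DAXPY-style loop has independent, regular iterations that keep the
pipes full, while a linear recurrence forces every operation to wait
for the one before it.

    /* Independent iterations: loads, multiplies, and adds from
       different iterations can overlap in the pipeline. */
    void daxpy(int n, double a, const double *x, double *y)
    {
        int i;
        for (i = 0; i < n; i++)
            y[i] = y[i] + a * x[i];    /* vectorizable */
    }

    /* A recurrence serializes the pipeline: iteration i cannot
       start until iteration i-1 has finished. */
    double horner(int n, double a, const double *x)
    {
        double s = 0.0;
        int i;
        for (i = 0; i < n; i++)
            s = a * s + x[i];          /* not vectorizable */
        return s;
    }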

Perhaps I've misunderstood your intent, though. Perhaps you meant
that an i860 (or MIPS or whatever) can outrun a Cray (or NEC or whatever)
on some programs. I guess I'm still doubtful. Do you have examples
you can tell us about?

Thanks,
Preston Briggs

Donald Lindsay
1989-10-14 23:26:48 EST
Gordon Bell, in the September CACM (p.1095) says, "By the end of
1989, the performance of the RISC, one-chip microprocessor should
surpass and remain ahead of any available minicomputer or mainframe
for nearly every significant benchmark and computational workload.
By using ECL gate arrays, it is relatively easy to build processors
that operate at 200 MHz (5 ns. clock) by 1990." (For those who don't
know, Mr. Bell has his name on the PDP-11, the VAX, and the Ardent
workstation.)

The big iron is fighting back, and that involves reducing its chip
count. Once a big CPU took ~10^4 chips; now it's more like 10^2. I
expect it will shortly be ~10 chips. Shorter paths, you know.

I see the hot micros and the big iron meeting in the middle. What
will distinguish their processors? Mainly, there will be cheap
systems. And then, there will be expensive ones, with liquid cooling,
superdense packaging, mongo buses, bad yield, all that stuff. Even
when no multichip processors remain, there will still be $1K systems
and $10M systems. Of course, there is no chance that the $10M system
will be uniprocessor.
--
Don D.C.Lindsay Carnegie Mellon Computer Science

Eugene Brooks
1989-10-15 14:20:48 EST
In article <1081@m3.mfci.UUCP> colwell@mfci.UUCP (Robert Colwell) writes:
>So while it's ok to chortle at how the micros have caught up on the scalar
>stuff, I think it would be an unwarranted extrapolation to imply that the
>supers have been superseded unless you also specify the workload.
Microprocessor development is not ignoring vectorizable workloads. The
latest have fully pipelined floating point and are capable of pipelining
several memory accesses. As I noted, interleaving directly on the memory
chip is trivial and memory chip makers will do it soon. Micros now dominate
the performance game for scalar code and are moving on to vectorizable code.
After all, these little critters mutate and become more voracious every
6 months and vectorizable code is the only thing left for them to conquer.
No NEW technology needs to be developed, all the micro-chip and memory-chip
makers need to do is to decide to take over the supercomputer market.

They will do this with their commodity parts.


Supercomputers of the future will be scalable multiprocessors made of many
hundreds to thousands of commodity microprocessors. They will be commodity
parts because these parts will be the fastest around and they will be cheap.
These scalable machines will have hundreds of commodity disk drives ganged up
for parallel access. Commodity parts will again be used because of the
cost advantage leveraged into a scalable system using commodity parts.
The only custom logic will be the interconnect which glues the system together,
and error-correcting logic which glues many disk drives together into a
reliable, high-performance system. The CM DataVault is a very good model here.
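
The idea behind ganging the drives is plain XOR parity (the sketch
below is my own illustration, not the DataVault's actual scheme,
which uses a wider error-correcting code): one parity drive lets you
rebuild any single failed data drive from the survivors.

    #include <stddef.h>

    #define NDRIVES 8      /* illustrative parameters */
    #define BLOCK   512

    /* Parity block = XOR of the corresponding data blocks. */
    void make_parity(unsigned char data[NDRIVES][BLOCK],
                     unsigned char parity[BLOCK])
    {
        size_t i;
        int d;
        for (i = 0; i < BLOCK; i++) {
            parity[i] = 0;
            for (d = 0; d < NDRIVES; d++)
                parity[i] ^= data[d][i];
        }
    }

    /* Rebuild drive `dead` by XORing parity with the survivors. */
    void rebuild(unsigned char data[NDRIVES][BLOCK],
                 unsigned char parity[BLOCK], int dead)
    {
        size_t i;
        int d;
        for (i = 0; i < BLOCK; i++) {
            unsigned char b = parity[i];
            for (d = 0; d < NDRIVES; d++)
                if (d != dead)
                    b ^= data[d][i];
            data[dead][i] = b;
        }
    }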


NOTHING WILL WITHSTAND THE ATTACK OF THE KILLER MICROS!



b*s@maddog.llnl.gov, brooks@maddog.uucp

Eugene Brooks
1989-10-15 14:30:11 EST
In article <2121@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>In article <35825@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov () writes:
>>The best of the microprocessors now EXCEED supercomputers for scalar
>>performance and the performance of microprocessors is not yet stagnant.
>
>Is this a fair statement? I've played some with the i860 and
Yes, in the sense that a scalar-dominated program has been compiled for
the i860 with a "green" compiler, no pun intended, and the same program
was compiled with a mature optimizing compiler on the Cray X-MP, and the
40 MHz i860 is faster for this code. Better compilers for the i860 will
widen the speed gap relative to the supercomputers.

>I can write (by hand so far) code that is pretty fast.
>However, the programs where it really zooms are vectorizable.
Yes, this micro beats the super on scalar code, and is not too sloppy
for hand-written code which exploits its cache and pipes well. The
compilers are not there yet for the vectorizable stuff on the i860.
Even if there were good compilers, the scalar-vector speed differential
is not as great on the i860 as it is on a supercomputer. Of course,
interleaved memory chips will arrive and microprocessors will use them.
Eventually the high performance micros will take the speed prize for
vectorizable code as well, but this will require another few years of
development.


b*s@maddog.llnl.gov, brooks@maddog.uucp

Eugene Brooks
1989-10-15 14:39:09 EST
In article <6523@pt.cs.cmu.edu> lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) writes:
>Gordon Bell, in the September CACM (p.1095) says, "By the end of
>1989, the performance of the RISC, one-chip microprocessor should
>surpass and remain ahead of any available minicomputer or mainframe
>for nearly every significant benchmark and computational workload.
It has already happened for SOME workloads, those which hit cache well
and are scalar-dominated. This was done without ECL parts. The ECL
parts will only make matters worse for custom processors; as Bell
indicates, they will come to dominate performance for all workloads.

>I see the hot micros and the big iron meeting in the middle. What
>will distinguish their processors?
Nothing.

>Mainly, there will be cheap
>systems. And then, there will be expensive ones, with liquid cooling,
>superdense packaging, mongo buses, bad yield, all that stuff. Even
>when no multichip processors remain, there will still be $1K systems
>and $10M systems. Of course, there is no chance that the $10M system
>will be uniprocessor.
The $10M systems will be scalable systems built out of the same microprocessor.
These systems will probably be based on coherent caches, the micros having
respectable on-chip caches which stay in sync with very large off-chip
caches. The off-chip caches are kept coherent through scalable networks.
The "custom" value-added part of the machine for the supercomputer vendor
to design is the interconnect and the I/O system. The supercomputer vendor
will still have a cooling problem on his hands because of the density of
heat sources in such a machine.
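
For concreteness, the caches might follow a minimal invalidation
protocol along these lines (an MSI-style sketch of my own, not any
vendor's actual design):

    /* Three states per cache line: Invalid, Shared (clean, possibly
       replicated), Modified (exclusive, dirty). */
    typedef enum { INVALID, SHARED, MODIFIED } line_state;

    /* Transition on the local processor's own access. */
    line_state on_local(line_state s, int is_write)
    {
        if (is_write)
            return MODIFIED;   /* network must invalidate other copies */
        return (s == INVALID) ? SHARED : s;
    }

    /* Transition when another cache's request arrives over the network. */
    line_state on_remote(line_state s, int is_write)
    {
        if (is_write)
            return INVALID;    /* the writer takes exclusive ownership */
        return (s == MODIFIED) ? SHARED : s;   /* supply data, demote */
    }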


b*s@maddog.llnl.gov, brooks@maddog.uucp

Mike Haertel
1989-10-15 15:24:01 EST
In article <1081@m3.mfci.UUCP> colwell@mfci.UUCP (Robert Colwell) writes:
>I take my hat off to them, too, because that's no mean feat. But don't
>forget that the supercomputers didn't set out to be the fastest machines
>on scalar code. If they had, they'd all have data caches, non-interleaved
>main memory, and no vector facilities. What the supercomputer designers

Excuse me, non-interleaved main memory? I've always assumed that
interleaved memory could help scalar code too. After all, instruction
fetch tends to take place from successive addresses. Of course if
main memory is very fast there is no point to interleaving it, but
if all you've got is drams with slow cycle times, I would expect
that interleaving them would benefit even straight scalar code.
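
Concretely, with low-order interleaving, consecutive words map to
successive banks, so even a purely sequential instruction stream
keeps every bank busy. A sketch with made-up parameters:

    #include <stdio.h>

    #define NBANKS 8    /* illustrative */
    #define WORD   8    /* bytes per word */

    /* Low-order interleaving: word address modulo bank count. */
    int bank_of(unsigned long addr)
    {
        return (int)((addr / WORD) % NBANKS);
    }

    int main(void)
    {
        unsigned long pc;
        /* Sequential fetches cycle through all eight banks, hiding
           each bank's slow cycle time behind the others. */
        for (pc = 0x1000; pc < 0x1000 + 8 * WORD; pc += WORD)
            printf("fetch %#lx -> bank %d\n", pc, bank_of(pc));
        return 0;
    }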
--
Mike Haertel <mike@stolaf.edu>
``There's nothing remarkable about it. All one has to do is hit the right
keys at the right time and the instrument plays itself.'' -- J. S. Bach

Eric S. Raymond
1989-10-15 20:29:13 EST
In <35825@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov wrote:
> The best of the microprocessors now EXCEED supercomputers for scalar
> performance and the performance of microprocessors is not yet stagnant.
> On scalar codes, commodity microprocessors ARE the fastest machines at
> any price and custom cpu architectures are doomed in this market.

Yes. And though this is a recent development, an unprejudiced observer could
have seen it coming for several years. I did, and had the temerity to say so
in print way back in 1986. My reasoning then is still relevant; *speed goes
where the volume market is*, because that's where the incentive and development
money to get the last mw-sec out of available fabrication technology is
concentrated.

Notice that nobody talks about GaAs technology for general-purpose processors
any more? Or dedicated Lisp machines? Both of these got overhauled by silicon
microprocessors because commodity chipmakers could amortize their development
costs over such a huge base that it became economical to push silicon to
densities nobody thought it could attain.

You heard it here first:

The supercomputer crowd is going to get its lunch eaten the same way. They're
going to keep sinking R&D funds into architectural fads, exotic materials,
and the quest for ever more ethereal heights of floating point performance.
They'll have a lot of fun and generate a bunch of sexy research papers.

Then one morning they're going to wake up and discover that the commodity
silicon guys, creeping in their petty pace from day to day, have somehow
managed to get better real-world performance out of their little boxes. And
supercomputers won't have a separate niche market anymore. And the
supercomputer companies will go the way of LMI, taking a bunch of unhappy
investors with them. La di da.

Trust me. I've seen it happen before...
--
Eric S. Raymond = eric@snark.uu.net (mad mastermind of TMN-Netnews)