Nr9 Prophecy being fulfilled?


Comments

  • Reply 61 of 74
    amorph Posts: 7,112 member
    Hey, is it my fault the horse has a sign on it that says "Kick me!"?



    Quote:

    Originally posted by Programmer

    Amorph, whether his machine is blindingly fast for the tasks he uses it isn't the issue... it is that he generalizes that into stating that current computers are fast enough. That's the kind of attitude that would still have us picking berries while watching cautiously for roaming lions.



    I'll buy that, with the caveat that most of them are fast enough for what's currently asked of them, and the need for speed comes from the desire to ask more of them.



    Quote:

    Other alternatives exist, however, and since the "easy" scaling has run into trouble I think we'll see more investment in them. Asynchronous designs (Sun is doing work there), multiple clock designs (à la the P4's double speed integer units), or specialized deep pipeline vector ALU cores (IBM/Sony's Cell) will all have parts running at higher speeds... but it's going to be a different ballgame than the exponential clockrate march we've been on until now.



    This is true, and Dave, when I said that clockspeed was dead as a scaling force, I didn't mean that no part of any CPU would ever clock higher. I was talking about the recently ended practice of scaling a large core higher. That's over, as far as anyone can tell. If you let the fastest logic in a current CPU run as fast as it can, you'll see a significant clockspeed boost for that logic. But you can't take that step without contemplating new and significantly different designs.

  • Reply 62 of 74
    imiloa Posts: 187 member
    Quote:

    Originally posted by wizard69

    Well at least you are reasonable and are not chanting that all clock rate increases are forever dead. This is what I have problems with. People see IBM and intel having problems and all of a sudden the whole industry is collared with the same problems.





    my 2 cents: asserting clock rate increases are dead is precarious, per programmer's apt berries/lions analogy.



    yes, silicon chips are bumping up against some fundamental physics, and yes, the industry has clearly hit some unexpected walls recently. i'm not an EE, but given the amount of money the various players pump into R&D and the deep experience of their teams, it seems clear these walls are difficult hurdles.



    that said, human technology has stalled at countless hurdles over time, and eventually radical new discoveries, ideas, and approaches tend to find a path around or tunnel under the walls. sometimes a breakthrough opens a new dimension of movement in which the wall isn't even a factor.



    penicillin is a good example. before 1928, who would have thought that mold would produce a powerful anti-bacterial agent? in fact, penicillin was discovered by accident when fleming found some bacteria samples, accidentally contaminated with mold, were being killed by the mold.




    and many discoveries arise from experiments that at first seem to fail. eg: rutherford's gold foil atom discovery. rutherford set up the experiment to prove that matter was effectively empty space, that alpha particles would pass through the foil effortlessly. instead, he found that 2% were deflected, and a tiny fraction bounced back.



    vis one of my favorite quotes:

    The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' (I found it!) but 'That's funny ...'

    - Asimov



    nutshell is that while the industry has hit some bumps, new paths to speed will no doubt be found. that's why i posted the "graphene" thread. still too early to tell if that tech will be useful. but a 1-2 atom thick conductive film could be a godsend in terms of heat.



    and it's only logical to anticipate breakthroughs beyond silicon hardware. eg: optical and quantum computing.





    in the meantime, a chunk of the R&D money that bounces off the walls will flow laterally into new paradigms of abstract circuit logic, as programmer notes.



    and this is a net good for all. ie: better designs soon, then later at higher speeds. per amorph's apt note that the G4 wall at 500 MHz induced mac programmers to focus more effort on optimizing their code and leveraging altivec. once the G4 scaled above 500 MHz, those improvements scaled with it.
  • Reply 63 of 74
    imiloa Posts: 187 member
    btw, in the general vein of professionals needing ever faster machines:



    <deadHorseBeating>

    take out a stop watch and time how long it takes to launch your most complex app on the fastest machine available. actually, MSFT office is a good candidate.



    when that app launches faster than i can move the mouse to select the first file menu option, then a milestone will be reached...



    immediately followed by a new rev of the app with more features, requiring twice the load time.



    if this sounds petty, imagine that it's 9:42am and you're giving a product presentation at 10:00am. the client balked at the initial UI designs, and the artists cranked all night and just dumped the new graphics in your FTP folder.



    you have 18 minutes to integrate them, test the presentation, reedit, retest, repeat, repeat, and get to the conference room without sweating on the way. in this scenario, small delays can be maddening. eg: waiting for an app to launch, edit, save, etc...



    of course, this would never happen in real life. funny how it does seem to happen all the time.



    replace presentation graphics integration with:



    - debug cycles:

    1) edit code.

    2) compile.

    3) run test.

    4) goto 1, ad nauseam



    - music compression. eg: to MP3.

    - video compression.



    etc...



    in the hardcore pro arena, consider:



    - real-time video montage. eg: monday night football highlight reels, thrown together while the game plays, complete with special effects and complex transitions. i don't know what hardware they use for this, but there's no reason it couldn't be done on a sufficiently fast mac/PC.



    </deadHorseBeating>
  • Reply 64 of 74
    programmer Posts: 3,503 member
    Quote:

    Originally posted by wizard69

    Yes the reality of having the extra area to implement another full 64 bit processor is a great advantage of the current process sizes. It is an excellent trade off for certain systems. The problem is that at a given feature size the trade off is a one time deal. Once you filled up the die to maximum size, based on the economics of production, the only recourse you have left is to go back to enhancing the cores themselves.



    Interesting way of looking at it, but adding more transistors is always easier than making faster ones... and adding more transistors in the form of whole additional cores is the easiest way to use them (except perhaps for larger on-chip memory arrays). Even scaling cache introduces tricky issues of latency, associativity, etc.





    imiloa: the load time issue is an interesting one. Faster hardware helps, but really the most effective way to eliminate load times is rooted in software. The hardware guys have to work really hard to double their performance, and that generally improves load times by less than half. The software guys could almost eliminate perceived load times on the existing hardware, if that were a high enough priority.



    Your other examples can benefit from process optimization but generally really do need the brute force of hardware to make big inroads. Of the three you list, the compile/link/test cycle is the one most amenable to acceleration by fixing the software. 10-15 years ago we had IDEs that gave faster cycle times than the current ones... on hardware that was 10-100 times slower. Now we have more features and the same or slower cycle times. Frustrating...
  • Reply 65 of 74
    Faster hardware is always needed to drive software more productively. In essence, faster hardware = faster software.



    Faster hardware also creates room for more powerful software to run at an acceptable performance level. Faster hardware allows for greater software innovations and features.



    Tighter code functions more EFFICIENTLY (and thus executes faster), but takes a while to optimize, which the industry in general does to some extent, but not to near-perfection (especially MS, with their bloat).



    There will always be a need for faster hardware just as there will always be a need for optimized code.



    I remember a while back, Bill Gates thought that no one would need a multimegabyte hard drive...
  • Reply 66 of 74
    rhumgod Posts: 1,289 member
    Faster hardware? Of course, it's not like the industry will stand still. Faster does not always mean higher clock speeds, however. Innovation will come in different forms. Optical switching perhaps, instead of copper-based switching (transistors). That would eliminate a lot of heat-generation issues. It's only a matter of time until companies engineer themselves out of the current predicament, which is why Nr9's prophecy is a rather near-sighted statement.
  • Reply 67 of 74
    airsluf Posts: 1,861 member
    Kickaha and Amorph couldn't moderate themselves out of a paper bag. Abdicate responsibility and succumb to idiocy. Two years of letting a member make personal attacks against others, then stepping aside when someone won't put up with it. Not only that but go ahead and shut down my posting privileges but not the one making the attacks. Not even the common decency to abide by their warning (after three days of absorbing personal attacks with no mods in sight), just shut my posting down and then say it might happen later if a certain line is crossed. Bullshit flag is flying, I won't abide by lying and coddling of liars who go off-site, create accounts differing in a single letter from my handle with the express purpose to deceive and then claim here that I did it. Everyone be warned, kim kap sol is a lying, deceitful poster.



    Now I guess they should have banned me rather than just shut off posting privileges, because kickaha and Amorph definitely aren't going to like being called to task when they thought they had it all ignored *cough* *cough* I mean under control. Just a couple o' tools.



    Don't worry, as soon as my work resetting my posts is done I'll disappear forever.

  • Reply 68 of 74
    imiloa Posts: 187 member
    Quote:

    Originally posted by Programmer

    imiloa: the load time issue is an interesting one. Faster hardware helps, but really the most effective way to eliminate load times is rooted in software.



    agreed. one load issue with modern software is all the visual assets associated with GUIs. that said, i'm a media programmer, mostly web games, and it's a given in my biz that you get your first UI up and running asap, streaming and configuring the rest of the app while the user is playing with the splash.



    the model doesn't translate perfectly to inherently non-linear apps like word and photoshop. but by example, word could load its text editing assets/buffers pronto to get users seeing and editing a doc, then load its spelling dictionaries and plug-ins in a threaded manner.



    ditto with photoshop, which could load the plug-ins in a threaded process after first getting the basic open file functionality running.



    small changes like this could allow users to start configuring their edit session, eg: opening files, initial changes, while the full feature set comes on-line.
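    to make that concrete, here's a toy sketch in python (purely illustrative; the Editor class, the fake load delay, and the dictionary contents are all invented, and word/photoshop obviously aren't built this way). the editor core is usable immediately, while a background thread streams in the heavy extras:

```python
import threading
import time

class Editor:
    """toy model of an app that shows its UI first and loads extras later."""
    def __init__(self):
        self.dictionary = None          # heavy asset, loaded in background
        self.ready = threading.Event()  # signals when the extras are on-line
        self.buffer = []                # editor core is usable immediately...
        # ...while dictionaries/plug-ins stream in on a worker thread
        threading.Thread(target=self._load_extras, daemon=True).start()

    def _load_extras(self):
        time.sleep(0.1)                 # stand-in for slow disk I/O
        self.dictionary = {"teh": "the"}
        self.ready.set()

    def type_text(self, text):
        self.buffer.append(text)        # works even before extras finish

    def spellcheck(self, word):
        self.ready.wait()               # only blocks if asked too early
        return self.dictionary.get(word, word)

ed = Editor()
ed.type_text("teh quick brown fox")     # user edits right away
print(ed.spellcheck("teh"))             # prints "the" once the dictionary is up
```

    the point is just the shape: whatever the user needs in the first seconds loads up front, everything else comes on-line behind a thread and an event.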



    again, this prolly sounds petty to most consumer users, but the pros in the audience know what i'm talking about.

    Quote:

    Of the three you list, the compile/link/test cycle is the one that is most able to be accelerated by fixing the software. 10-15 years ago we had IDEs that gave faster cycle times than the current ones... on hardware that was 10-100 times slower. Now we have more features and the same or slower cycle times. Frustrating...



    definitely frustrating. my guess is that one issue is the complexity of modern OS APIs. thankfully, i last dealt with wintel programming with win 3.1. at the time, i had about a 15" (40cm) stack of paperback manuals on the API calls, about 6" (15cm) just for the GUI libs.



    by comparison, when i was programming on SGIs in the early 90s (IRIX unix), my manuals covering OS, 3D hardware libs, and the "inventor" scripting language were only about 6 inches total.



    i've been doing web dev (flash/shockwave/java) since '95, so out of the C/C++/C# API loop. but i can imagine winXP manuals require a full bookshelf these days. all info the compiler needs to juggle.



    out of curiosity, how much of the slowdown is due to compiler logic optimization? ie: early 90s SGI compilers offered four levels of logic optimization. the fourth, most advanced level could greatly impact build time. i'm guessing that 10 years later, good compilers have some pretty advanced analysis tools for optimization?





    btw, programmer, kudos for your service to this board! i've been reading AI since the late 90s (before the shutdown hiatus), but only recently started posting. your contributions here have always been sharp, informed, and refreshingly grounded in facts and logic.
  • Reply 69 of 74
    imiloa Posts: 187 member
    Quote:

    Originally posted by AirSluf

    Matter of fact the wrong "tightening" can be devastating to optimization.



    not crisply clear what you mean. my guess is that you're asserting that most programmers are not as qualified/skilled at optimization as the compilers they use. that's certainly true in many cases.



    rational need for assembler-level instruction programming is probably relegated to very small niches of development these days. eg: graphics card drivers, router firmware, etc...



    and i definitely agree the concept of hand-specifying "register" assignments in C-level code is pretty moot. i used registers heavily in my early SGI graphics programming, but every processor has a different register interface, requiring #defines everywhere. so such decisions are better left to a well-engineered compiler.



    all that said, there is no substitute for solid, factored architecture/design. i mostly do flash/lingo/java coding these days, pretty much playdoh environments. eg: no byte-level memory mgmt, direct pointer indexing, etc...



    in these environments, clean well-factored design and a decent perception of how the scripts translate to bytecode are crucial to competent performance programming. ie: in playdoh land, you can't always predict how the compiler will implement your script, but you can tailor your designs to use fewer variables, stack vars in tight loops, fewer stack pushes, etc... all of which result empirically in faster performing modules.
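    by example, the kind of tailoring i mean can be sketched in python (a stand-in for lingo/actionscript here; the function names and numbers are invented for illustration). both versions compute the same thing, but the second hoists repeated lookups into locals and avoids per-pass indexing, which empirically pays off in interpreted environments:

```python
import math

def norms_naive(points):
    # repeated attribute lookup (math.sqrt, out.append) and
    # indexing happen inside the tight loop on every pass
    out = []
    for p in points:
        out.append(math.sqrt(p[0] * p[0] + p[1] * p[1]))
    return out

def norms_tailored(points):
    # same result, but the lookups are hoisted into locals once,
    # outside the loop, and tuple unpacking replaces double indexing
    sqrt = math.sqrt
    out = []
    push = out.append
    for x, y in points:
        push(sqrt(x * x + y * y))
    return out

pts = [(3.0, 4.0), (5.0, 12.0)]
print(norms_naive(pts))       # prints [5.0, 13.0]
print(norms_tailored(pts))    # identical output, fewer lookups per pass
```

    neither version tells the compiler anything; the design itself just gives the runtime less to do per iteration.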



    nutshell, then. until we have AI software designers that can parse a software requirement specified in human language, and render structured code to meet it, the value of human intelligence in design and optimization remains a key factor in overall performance.





    re: your cocoa editor example, imo, the best motive for using the 10-line "slower" API call version is that it will inherit any optimizations/evolutions that occur to the cocoa-based editing logic down the road.



    ie: hacking an optimized version yourself can be a quick path to obsolescence. by example, consider all the system 6 mac games that wouldn't run native in system 7, due to apple rewiring the graphics card API, blocking direct-access hacks the games used. the sys 6 games ran much faster due to the hacks, but were instantly broken when access to the unsupported features was lost.
  • Reply 70 of 74
    amorph Posts: 7,112 member
    Quote:

    Originally posted by imiloa

    not crisply clear what you mean. my guess is that you're asserting that most programmers are not as qualified/skilled at optimization as the compilers they use. that's certainly true in many cases.



    I think this illustrates what AirSluf was getting at:



    In one of the WWDC videos, I think it was two years ago, there was a course on how to optimize for the G4. The example was the equation at the center of the flurry screensaver, which, when Apple first got the code, was tersely coded and used sqrt(). So the Apple guy throws up some pretty good-looking (in the sense of easy to follow) C code. Then they started optimizing it: Unrolling loops. Using AltiVec. Hoisting loads. Working around the dismal performance of sqrt() on the G4. The lecturer would talk about what steps were necessary at each iteration to improve the speed of the code, and then display a slide with those steps implemented.



    By the end of the presentation, the audience was laughing. Literally, laughing out loud. The equation had swelled to 100+ lines of highly repetitive, write-only code. But it ran like gangbusters on a G4. That code is what shipped.



    The point being that no-one in their right mind would call that code tight.



    Quote:

    all that said, there is no substitute for solid, factored architecture/design.



    This is indisputably true. A good design can moot the need for any amount of hairy low-level muckery, and improve the code in other respects as well. And as you point out with your example of System 6 games, it's a good idea to avoid hairy low-level muckery whenever possible.



    Unfortunately, it's very hard to avoid the practice of explicit loop unrolling for performance-critical PowerPC code, at least until gcc gets a clue.
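    To illustrate the shape of that transformation, here is a sketch in Python (illustrative only; the real payoff is in C, where the unrolled body can map onto AltiVec registers). The unrolled version processes four elements per trip and cleans up the remainder, trading readability for throughput:

```python
def scale_simple(src, k):
    # the readable version: one element per iteration
    return [k * x for x in src]

def scale_unrolled(src, k):
    # manually unrolled by four, the way performance-critical PPC code
    # often had to be written; a vectorizing compiler should do this for you
    out = [0.0] * len(src)
    n4 = len(src) - (len(src) % 4)
    i = 0
    while i < n4:                      # main loop: four elements per trip
        out[i]     = k * src[i]
        out[i + 1] = k * src[i + 1]
        out[i + 2] = k * src[i + 2]
        out[i + 3] = k * src[i + 3]
        i += 4
    while i < len(src):                # scalar cleanup for the remainder
        out[i] = k * src[i]
        i += 1
    return out

data = [1.0, 2.0, 3.0, 4.0, 5.0]
assert scale_unrolled(data, 2.0) == scale_simple(data, 2.0)
```

    Multiply the body by a real equation, AltiVec intrinsics, and hoisted loads, and you get the 100+ line write-only monster the audience was laughing at.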
  • Reply 71 of 74
    airsluf Posts: 1,861 member
  • Reply 72 of 74
    imiloa Posts: 187 member
    amorph, airsluf, got it. great example, re: flurry.



    re: SIMD programming, also agreed. i was offered a job learning/doing playstation 2 render engine coding a few years back. after looking thru a few API specs, i politely declined.



    such jobs are better suited for people with relentless passion for minute precision.
  • Reply 73 of 74
    For a while at least it sounds like we're in for an eclectic dance as hardware and software manufacturers exploit untapped potential in non-CPU components. Naturally, crunching big loads (like uncompressed HD footage) will benefit from a multiple machine, shared-burden approach. Whoever thinks that we have enough economically manageable system/storage speed for that kind of chore should stare down the price of the 3.5 TB Xserve RAID. Small multimedia design studios will want to go there if they haven't already, but they will also need to do so with a faster system (from faster CPUs, GPUs, storage capacities, and software) that doesn't break the bank.
  • Reply 74 of 74
    Quote:

    Originally posted by Amorph

    The equation had swelled to 100+ lines of highly repetitive, write-only code. But it ran like gangbusters on a G4... Unfortunately, it's very hard to avoid the practice of explicit loop unrolling for performance-critical PowerPC code, at least until gcc gets a clue.



    It's funny, but highly repetitive tasks are what computers are generally good at. The problem with getting GCC to do this stuff is not with GCC, it's with C/C++. It is the wrong language to be expressing these algorithms in. Nobody should have to write such convoluted code, but expressing your algorithm in a higher level language which then emits either machine code or C++ code can create these kinds of optimized implementations without a human having to endure the agony. Unfortunately not a lot of work has been done in this area... yet. And when people do work on it they tend to want to contort C/C++ into attempting to do the task. Personally I find something like Sh or even GLSL much more interesting.
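    As a toy sketch of that idea in Python (the generator, its names, and its output format are all invented for illustration, not any real tool): describe the computation at a high level and let a program emit the repetitive, unrolled C, so no human has to write or maintain it:

```python
def emit_unrolled_c(fn_name, expr, unroll=4):
    """Emit C source for out[i] = expr(in[i]), unrolled `unroll` times.
    `expr` is a template using {x}; this is a toy generator, not a real tool."""
    # the repetitive unrolled body is produced mechanically
    body = "\n".join(
        f"        out[i + {j}] = {expr.format(x=f'in[i + {j}]')};"
        for j in range(unroll)
    )
    return (
        f"void {fn_name}(const float *in, float *out, int n) {{\n"
        f"    int i;\n"
        f"    for (i = 0; i + {unroll} <= n; i += {unroll}) {{\n"
        f"{body}\n"
        f"    }}\n"
        f"    for (; i < n; i++)\n"
        f"        out[i] = {expr.format(x='in[i]')};\n"
        f"}}\n"
    )

# the human writes one line; the generator endures the repetition
print(emit_unrolled_c("scale2", "2.0f * {x}"))
```

    Change `unroll` to 8 or 16 and the generator absorbs the extra tedium; a human maintaining the emitted source would not.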



    People laugh at the kinds of things that must be done to get peak performance from modern hardware, but they don't seem to realize that it's not just the G4 and it's not going to get any better anytime soon... it's going to get worse. Since doing it "by hand" obviously isn't a realistic option, we're going to need improved tools to achieve anything close to the potential performance of future hardware. This applies to SIMD as well as highly-MP systems.