And the bottleneck is?

clive · August 22, 2002 4:11PM

I've read some opinions at various sites (by people who may or may not know what they are talking about) that indicate that the bottleneck in current Macs is the MaxBus, which has a throughput limit of 1Gb/s (maybe 1GB/s!?). And that, seemingly, near enough any current G4 can flood it, no question.
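On the Gb/s versus GB/s question, a quick back-of-the-envelope check helps. The sketch below assumes a 64-bit-wide data bus doing one transfer per clock at 133 MHz and 167 MHz. The bus width, the one-transfer-per-clock figure and the function name are assumptions for illustration, not numbers anyone has confirmed here.

```python
def peak_bandwidth_gb_per_s(clock_mhz: float, bus_width_bits: int = 64,
                            transfers_per_clock: int = 1) -> float:
    """Theoretical peak bus bandwidth in gigabytes per second (10^9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8  # assumed 64-bit data path
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9

# Bus clocks assumed for the older and newer machines.
for mhz in (133, 167):
    gbytes = peak_bandwidth_gb_per_s(mhz)
    print(f"{mhz} MHz: {gbytes:.2f} GB/s = {gbytes * 8:.1f} Gb/s")
```

Under those assumptions the theoretical peak is on the order of 1 GB/s (gigabytes), not 1 Gb/s. Sustained throughput would be lower than the peak.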

Hmm.

If this is true, then it poses a couple of questions:

If, factually, a G4 can flood the MaxBus, is it actually likely to do so with any regularity? i.e. if I'm running a renderer or hammering Photoshop, am I going to be taking the MaxBus to the limit throughout the process?

If I am, then doesn't it follow that faster G4s aren't necessarily going to result in faster "real world" results? i.e. in a traffic jam my Mustang is as fast as Flashman's Ferrari?

Are there other issues involved?

[ 08-22-2002: Message edited by: Clive ]

blabla · August 22, 2002 4:21PM

Car analogies SUKS!!

And this is hardly FH.

Depending on image size, I don't think you would see a significant speed-up by using a double-pumped FSB. At least not for the 2 MB DDR-SDRAM models. Much of the image would be in the cache, so... But there are other applications that would benefit much more from increased bandwidth.

spart · August 22, 2002 4:35PM

[quote]Originally posted by blabla:

Car analogies SUKS!!

And this is hardly FH.

Depending on image size, I don't think you would see a significant speed-up by using a double-pumped FSB. At least not for the 2 MB DDR-SDRAM models. Much of the image would be in the cache, so... But there are other applications that would benefit much more from increased bandwidth.<hr></blockquote>

Who the hell works with an image with a scratch size of less than two megs?

In any event, the G4 can fill it up pretty well (and if both processors are going at full blast, I believe, they will have to do a bit of sharing.)

Someone want to back me up or throw me down? Programmer?

blabla · August 22, 2002 4:41PM

[quote]Originally posted by Spart:



Who the hell works with an image with a scratch size of less than two megs?

<hr></blockquote>

Well, when you are running a Gaussian blur filter, you are not working on all the data in scratch. Right????

You've got something like 4 MB of L3 cache and 512 KB of L2 cache. No need to be a genius to see that DDR-SDRAM won't bring magic in a scenario like that.

Photoshop is NOT the application where you will see the biggest benefit.

yevgeny · August 22, 2002 4:45PM

[quote]Originally posted by Spart:



In any event, the G4 can fill it up pretty well (and if both processors are going at full blast, I believe, they will have to do a bit of sharing.)

Someone want to back me up or throw me down? Programmer?

<hr></blockquote>

Yes, you can very easily overwhelm the 167 MHz MaxBus. This does not mean that you always overwhelm the bus. This does not mean that the second processor is useless. This does not mean that the extra 33MHz of bus speed is a useless addition.

airsluf · August 22, 2002 5:46PM

clive · August 23, 2002 7:42AM

[quote]Originally posted by Yevgeny:

Yes, you can very easily overwhelm the 167 MHz MaxBus. This does not mean that you always overwhelm the bus. This does not mean that the second processor is useless. This does not mean that the extra 33MHz of bus speed is a useless addition.<hr></blockquote>

Well this is pretty much what I'm thinking. But the facts, at the end of the day, are that those wanting that performance are also those that are most likely to flood the bus - people doing rendering and image manipulation.

This being the case it does begin to sound like, in those situations, a dual 1.25GHz G4 isn't such a good deal compared to a dual .867GHz.

Anyone seen any processing tests on 50MB Photoshop files?

rogue27 · August 23, 2002 8:54AM

Well, if FSB is the only bottleneck, the dual 1.25 has two advantages over the dual 867.

The 25% faster FSB would translate into 25% faster performance if the bus was the only bottleneck. The extra 1MB of L3 cache per processor would also mean that there wouldn't need to be as much data sent down the bus, thus improving performance further.

1MB may not sound like a lot, but it is very helpful. You do not do one calculation on every bit of data and send it back into memory. Typically a filter or something of that nature will be a tight loop where several calculations are done on an area. If that loop can fit into the cache, that greatly improves performance and reduces FSB bandwidth usage, because parts of the loop don't have to be swapped between cache and main memory over and over again.
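A rough way to picture this: the sketch below counts the bytes that cross the bus when a filter makes several passes over an image. Either each tile fits in cache and is fetched once, or it doesn't and every pass has to go back to main memory. The function name, the 50 MB image, the 4 passes, the 1.5 MB tile and the cache sizes are all hypothetical, and the model ignores write-backs, prefetching and tile-edge overlap.

```python
def bus_traffic_mb(image_mb: float, passes: int, tile_mb: float, cache_mb: float) -> float:
    """Estimate MB read over the bus for a multi-pass filter.

    If a tile fits in cache, it is fetched once and every pass runs on it
    from cache. If not, every pass re-fetches the data from main memory.
    """
    if tile_mb <= cache_mb:
        return image_mb           # each byte crosses the bus once
    return image_mb * passes      # each byte crosses the bus once per pass

image_mb, passes = 50.0, 4        # hypothetical workload
for cache_mb in (1.0, 2.0):
    traffic = bus_traffic_mb(image_mb, passes, tile_mb=1.5, cache_mb=cache_mb)
    print(f"{cache_mb:.0f} MB cache: {traffic:.0f} MB over the bus")
```

In this toy case the extra megabyte of cache is the difference between reading the image once and reading it four times.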

tht · August 23, 2002 10:13AM

Originally posted by rogue27:

Well, if FSB is the only bottleneck, the dual 1.25 has two advantages over the dual 867.

The 25% faster FSB would translate into 25% faster performance if the bus was the only bottleneck. The extra 1MB of L3 cache per processor would also mean that there wouldn't need to be as much data sent down the bus, thus improving performance further.

Let's not put it that way, that is, that if a computation is memory bound, a 25% faster FSB = 25% faster performance. Increased bandwidth rarely if ever (I can't think of many benchmarks I've seen) translates into an equal percentage performance increase. There is an increase, but it's not that much. Maybe 10 to 15%. There are many other factors in performance that reduce the theoretical benefits one way or another.

Cache hit rates on L2 are very high (in the 90% range), so a 1.25 GHz processor will be around 40% faster than a 0.867 GHz processor, on memory bound apps. The single biggest advantage of the 1.25 GHz over the 0.867 GHz processor is the extra 0.380 GHz clock cycles and the extra 6 GBytes/sec of L2 bandwidth going to the execution units.
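To put rough numbers on this, here is a minimal sketch of a two-part run-time model: one part scales with core clock (work served from cache), the other is time spent waiting on main memory and does not scale. The function name and the stall fractions are made up for illustration; only the two clock speeds come from this thread, and the model ignores the faster bus on the newer machine.

```python
def speedup_from_clock(old_ghz: float, new_ghz: float, stall_fraction: float) -> float:
    """Speedup when only the non-stalled share of run time scales with clock.

    stall_fraction is the share of the ORIGINAL run time spent waiting on
    main memory (a hypothetical input, not a measured value).
    """
    core_time = (1.0 - stall_fraction) * (old_ghz / new_ghz)
    return 1.0 / (core_time + stall_fraction)

for stall in (0.0, 0.1, 0.3):
    s = speedup_from_clock(0.867, 1.25, stall)
    print(f"stalled {stall:.0%} of the time: {s:.2f}x")
```

With a small stall fraction, which is what a high cache hit rate implies, the result stays close to the 1.25/0.867 clock ratio, in line with the "around 40%" figure.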

groverat · August 23, 2002 10:25AM

[quote]Originally posted by THT:

The single biggest advantage of the 1.25 GHz over the 0.867 GHz processor is the extra 0.380 GHz clock cycles and the extra 6 GBytes/sec of L2 bandwidth going to the execution units.<hr></blockquote>

The second being better utilization of the speed holes.

Don't quote me on it, though, but Very Important People have told me speed holes add 5% to overall performance.

[ 08-23-2002: Message edited by: groverat ]

tht · August 23, 2002 11:16AM

Originally posted by groverat:

The second being better utilization of the speed holes.

Don't quote me on it, though, but Very Important People have told me speed holes add 5% to overall performance.



I believe ya, man. As is evident in jet aircraft, optimization of the air intakes, speed holes, can increase the amount of flow going in the machine which further increases its performance. Especially at higher altitudes!

rogue27 · August 23, 2002 2:04PM

Maybe you weren't reading my post, but I said "*If* FSB bandwidth was the only bottleneck..."

And thus my post is true.

However, that is not the only thing that affects performance and I was not claiming that it was. I was just responding to what the poster above me said.

imud · August 23, 2002 2:47PM

[quote]Originally posted by groverat:



The second being better utilization of the speed holes.

Don't quote me on it, though, but Very Important People have told me speed holes add 5% to overall performance.

<hr></blockquote>

So if I take my handy Black & Decker cordless drill and punch 20 speed holes I'll end up with a 100% overall performance? COOL! Hmm, any particular pattern? Or should I go for that swiss cheese/buckshot look?

Looks like the current case will only hold 11 more speed holes, but hey, that's a 55% increase, right?

[ 08-23-2002: Message edited by: iMud ]

tht · August 23, 2002 2:59PM

Originally posted by rogue27:

Maybe you weren't reading my post, but I said "*If* FSB bandwidth was the only bottleneck..."

And thus my post is true.

However, that is not the only thing that affects performance and I was not claiming that it was. I was just responding to what the poster above me said.

Yes. But let's not contribute to the notion that the FSB is what is holding Apple back in getting uber-performance. Getting a DDR FSB, doubling its theoretical bandwidth, will not yield a doubling of performance in memory bound apps. It'll give the same clock speed processor maybe 10 to 15%, maybe.

We are not even sure what apps are memory bound...
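A 10 to 15% gain is about what Amdahl's law gives if only a modest share of run time is actually spent waiting on the bus. A minimal sketch, with the bus-bound fractions picked purely for illustration (nobody here has measured them):

```python
def amdahl_speedup(bound_fraction: float, factor: float) -> float:
    """Amdahl's law: overall speedup when a fraction of run time gets `factor` times faster."""
    return 1.0 / ((1.0 - bound_fraction) + bound_fraction / factor)

# Doubling bus bandwidth (factor = 2) for several hypothetical bus-bound fractions.
for fraction in (0.2, 0.25, 0.5, 1.0):
    gain = (amdahl_speedup(fraction, 2.0) - 1.0) * 100
    print(f"{fraction:.0%} bus-bound: about {gain:.0f}% faster")
```

Only a job that waits on the bus 100% of the time would see its performance double. At 20 to 25% bus-bound the gain lands in the 10 to 15% range.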

anders · August 23, 2002 3:08PM

[quote]Originally posted by iMud:



So if I take my handy Black & Decker cordless drill and punch 20 speed holes I'll end up with a 100% overall performance? COOL! Hmm, any particular pattern? Or should I go for that swiss cheese/buckshot look?

Looks like the current case will only hold 11 more speed holes, but hey, that's a 55% increase, right?

[ 08-23-2002: Message edited by: iMud ]<hr></blockquote>

Hey, it's a Mac you are talking about.

You can't just make holes as you like. They have to be aesthetically correct holes or it simply doesn't work.

That's exactly where the "form over function" yellers are wrong. Part of the PPC specification is its interaction with its surrounding design. Remember, the PPC is an elegant CPU design, and putting it into something as ugly as you propose will make it work even slower.

Ever thought about the ugly boxes they put Intel and AMD chips in, and the bad wiring? It's necessary for those chips to function and not a choice of taste.

imud · August 23, 2002 3:47PM

OW! Anyone got some ice? Someone left a mark on my posterior. [Skeptical]

airsluf · August 24, 2002 4:01PM

moogs · August 25, 2002 2:55PM

No bottleneck to see here people ... move along now, move along. It's all just a bunch of hype planted by Microsoft lackeys in the press.

That speed hole joke - that wasn't funny. And car analogies are all wrong because it really ANGERS those of us who would like to believe, so no more car analogies - even if they are perfectly valid!
