NVidia in hot water?

Posted in Future Apple Hardware (edited January 2014)
First off, let's check out SemiAccurate's ancient report of NVidia's Fermi troubles:

http://www.semiaccurate.com/2010/02/...and-unfixable/



This article is from Feb 2010, but you really ought to read the whole thing. Nvidia's problems are much greater than this small excerpt implies.



Quote:

A really rough measure of yield is that for similar products, the yield goes down by the square of the die size. A 200mm^2 chip can be expected to have 1/4 the yield of a similar 100mm^2 chip, and a 50mm^2 chip will have about 4 times the yield of the 100mm^2 part. Chip makers put lots of redundant structures into every design in order to repair some kinds of fabrication errors, but there are limits.



Each redundancy adds to the area of the design, so the base cost for the chip is higher. Semiconductor manufacturing is a series of complex tradeoffs, and the cost for redundant area versus yield is one of the simpler ones. If you plan right, you can make very high yielding chips with only a little extra die area.



If things go well, the cost of the redundant area is less than you would lose by not having it there at all. If things go badly, you get large chips that you can't make at anything close to a viable cost. The AMD K6-III CPU was rumored to be an example of this kind of failure.



Last spring and summer, ATI was not shy about telling people that the lessons learned from the RV740 were fed back into the Evergreen 5000 series chips, and it was a very productive learning experience. One of the deep, dark secrets was that there were via (interconnects between the metal layers on the chip) problems. The other was that the TSMC 40nm transistors were quite variable in transistor construction, specifically in the channel length.



Since Anand talked about both problems in his excellent Evergreen history article, any promises to keep this secret are now a moot point. What ATI did with Evergreen was to put two vias in instead of one. It also changed transistor designs and layout to mitigate the variances. Both of these cost a lot of area, and likely burn a more than negligible amount of energy, but they are necessary.



Nvidia on the other hand did not do their homework at all. In its usual 'bull in a china shop' way, SemiAccurate was told several times that the officially blessed Nvidia solution to the problem was engineering by screaming at people. Needless to say, while cathartic, it does not change chip design or the laws of physics. It doesn't make you friends either.



By the time Nvidia found out about the problems, it was far too late to implement the fixes in Fermi GF100. Unless TSMC pulled off a miracle, the design was basically doomed.



Why? GF100 is about 550mm^2 in size, slightly larger than we reported after tapeout. Nvidia ran into severe yield problems with a 100mm^2 chip, a 3 month delay with a 139mm^2 chip, and had to scrap any larger designs due to a complete inability to manufacture them. Without doing the homework ATI did, it is now trying to make a 550mm^2 part.



Basic math says that the GF100 is a hair under 4 times as large as the G215, and they are somewhat similar chips, so you can expect GF100 yields to be around 1/16th that of the smaller part. G215 is not yielding well, but even if it was at a 99 percent yield, you could expect Fermi GF100 to have single digit percentage yields. Last time we heard hard numbers, the G215 was not yielding that high.
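
To put the article's yield math in concrete terms, here's a rough back-of-the-envelope sketch in Python. The die areas are the article's own numbers (taking the 139mm^2 chip as the G215, which is what its "4 times as large" comparison implies), and the alternate baseline yields are purely hypothetical:

Code:

# Rough yield scaling per the article's rule of thumb: for similar chips,
# yield falls off with the square of the die-area ratio.

def relative_yield(base_yield, base_area_mm2, new_area_mm2):
    """Scale a baseline yield by the inverse square of the area ratio."""
    ratio = new_area_mm2 / base_area_mm2
    return base_yield / (ratio ** 2)

g215_area = 139.0   # mm^2, per the article
gf100_area = 550.0  # mm^2, per the article

for base in (0.99, 0.60, 0.40):  # 99% is the article's hypothetical; the others are made up
    est = relative_yield(base, g215_area, gf100_area)
    print(f"G215 yield {base:.0%} -> estimated GF100 yield {est:.1%}")

# Even starting from the (generous) 99% baseline, the estimate lands in
# single digits, which is exactly the article's point.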



Second, let's see what Fermi does when all the GTX 480 compute power is turned on:

http://en.expreview.com/2010/08/09/w...-480/9070.html



Quote:

According to the test, the full spec’ed 512SP GTX 480 just brings a performance improvement of no more than 6% over the 480SP GTX 480.



The power consumption and temperature of GF100 have become NVIDIA’s headache. Even without overclocking, the full-load power consumption reached a horrible 644W, which was 204W more than the 480SP model with the same clocks. The full-load temperature was 94℃ even with the help of the outstanding Accelero Xtreme Plus cooling solution. But its noise was lower than that of the reference model.
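
Running the arithmetic on that excerpt (these are expreview's whole-system, full-load figures, so treat the ratios as rough):

Code:

# Back-of-the-envelope numbers from the expreview test quoted above.
full_512sp_watts = 644.0            # quoted full-load system draw, 512SP card
delta_watts = 204.0                 # quoted difference vs. the 480SP card
full_480sp_watts = full_512sp_watts - delta_watts

perf_gain = 0.06                    # "no more than 6%" per the review
shader_gain = 512 / 480 - 1         # ~6.7% more shaders enabled

power_gain = full_512sp_watts / full_480sp_watts - 1
perf_per_watt_change = (1 + perf_gain) / (1 + power_gain) - 1

print(f"480SP full-load draw: {full_480sp_watts:.0f} W")
print(f"Extra shaders: {shader_gain:+.1%}, extra power: {power_gain:+.1%}")
print(f"Perf-per-watt change: {perf_per_watt_change:+.1%}")
# Roughly +6% performance for ~+46% power: perf per watt drops by more
# than a quarter when the last 32 shaders are switched on.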



Third, let's consider that Intel and AMD are integrating their own GPU solutions on-die, choking Nvidia out of the IGP market entirely. It's an IGP race Intel can't hope to win on performance, yet they persist.



Fourth, let's consider the rumours of NVidia developing their own CPU, supposedly an ARM core with a Transmeta-inspired x86 front end, and the speed and efficiency that does (not) imply:

http://www.semiaccurate.com/2010/08/...idias-x86-cpu/



Quote:

On the technical side, the problem is simple: speed. ARM A9 CPUs are great for phone level applications, and can reach into the current tablet space, but hit a glass ceiling there. If Eagle doubles the performance per MHz and doubles performance per watt, it will basically be on par with the low end of the Atom-class CPUs, and woefully behind the Nano/Bobcat level of performance.



The question: are the odds of NVidia recovering any time soon any good? It appears they are fooked for at least the next 18 months. Their GTS 450 scores significantly lower than equivalent GPUs from their previous lineup. Fermi doesn't have legs, and NV supposedly loses money on each power-hungry, expensive 470 and 480 sold.



Outside of the GTX 460, they don't seem to have appealing products in their portfolio.



And the Radeon 5000 series is so full of win that ATI/AMD could take the next year off and still clean up in the GPU market.



But the Radeon 6000 series debuts in a month...

Comments

  • Reply 1 of 15
    hiro Posts: 2,663 member
    Well, Nvidia essentially stole the talent from SGI, a company that collectively got fat, dumb, and happy after early success. Hubris caught up to SGI and made them irrelevant; it is looking eerily like history repeating itself here too.
  • Reply 2 of 15
    Here are benchmarks of the GTX 260 pounding the crap out of the new, exciting GTS 450.



    http://www.overclock.net/hardware-ne...cond-full.html



    Outside of the pro market, i.e. Photoshop/CUDA and insanely expensive pro cards for 3D, I don't see where they are going.



    Is Fermi the new Itanium?
  • Reply 3 of 15
    Quote:
    Originally Posted by 1337_5L4Xx0R


    Here are benchmarks of the GTX 260 pounding the crap out of the new, exciting GTS 450.



    http://www.overclock.net/hardware-ne...cond-full.html



    Outside of the pro market, i.e. Photoshop/CUDA and insanely expensive pro cards for 3D, I don't see where they are going.



    Is Fermi the new Itanium?



    They're doing quite well in the supercomputing arena too. I'm working with some Tesla cards right now, and they really are fantastic pieces of hardware for anything that does lots of FP calculations.
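
    For anyone curious what "lots of FP calculations" looks like in practice, here's a minimal sketch of the pattern -- not the code I'm actually running, just an illustrative SAXPY-style kernel in Python via numba's CUDA support, and it assumes you have a CUDA-capable card and numba installed:

    Code:

    import math
    import numpy as np
    from numba import cuda

    @cuda.jit
    def saxpy(a, x, y, out):
        # One GPU thread per element: out = a*x + y, all single-precision FP.
        i = cuda.grid(1)
        if i < x.size:
            out[i] = a * x[i] + y[i]

    n = 1 << 22  # ~4 million elements
    x = np.random.rand(n).astype(np.float32)
    y = np.random.rand(n).astype(np.float32)
    out = np.empty_like(x)

    threads_per_block = 256
    blocks = math.ceil(n / threads_per_block)
    saxpy[blocks, threads_per_block](np.float32(2.0), x, y, out)

    print(out[:4])  # numba copies host arrays to the device and back for us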
  • Reply 4 of 15
    wizard69 Posts: 13,377 member
    Some of the material referenced here is questionable.



    For example, everything I've seen indicates very good performance and very low power out of the Cortex A9. For some tasks it should do very well when put up against Atom. Look on the net for info on Samsung's new Orion chip, for example. In many circles Atom is a bit of a joke: way too much power (watts) used for the power (performance) delivered. The other thing with Cortex is that people have already produced units with 22nm feature sizes; there is a lot of interest in ARM right now, with everybody trying to up performance and lower power usage.



    As to some of the other references, what was an issue at the end of last year isn't necessarily an issue now. Remember, much of what gets leaked is often the perspective of one person. To imply that we have the big picture about yield issues is wishful thinking.



    Now about NVidia implementing an ARM-based SoC: if they are smart about it, they could be very successful going after the embedded market. I say embedded because there is a much wider market than just tablets for such devices. 32-bit devices still have legs in this space. The problems for NVidia are these: the market is cost sensitive, and I don't think they have a chance in hell of getting the feature set right.



    NVidia's problem, though, isn't Intel but rather AMD. If the info currently available holds up in shipping products, Fusion is going to change the marketplace drastically. Neither Intel nor Nvidia really has an answer for Ontario, which is dual core + GPU at under 9 watts in some versions. That is likely twice the power of an ARM-based solution, but you get 64-bit x86 that is reasonably fast.



    In the end I suspect NVidia will be purchased for patents and talent. Their hardware and software stack used to be pretty good, but then they started pushing these solutions with excessive thermals. I just don't think the demand is there to burn power like that. Sure there are exceptions, but given a choice I suspect most people will take the cooler high-performance chip these days. That would be AMD, especially now that AMD has become much better with their software stack.
  • Reply 5 of 15
    hiro Posts: 2,663 member
    Quote:
    Originally Posted by wizard69


    In the end I suspect NVidia will be purchased for patents and talent.



    Wow, even a bit more morbid than me! I think Nvidia can survive one single f-up release, but the true test will be how quickly they get to what the big market really wants -- performance per watt. I don't know that they can shift fast enough, because they pretty much avoided that aspect for the past 5 years, but I think they have a sliver of a chance if Huang can pull his head out of his arse. And Tegra isn't the answer: Nvidia hasn't previously been a CPU company, they have a long way to go to get established there, and it won't hold up the rest of the sinking ship. They either freshen up their GPU design philosophy or else.
  • Reply 6 of 15
    I agree. The reason they've been getting served by AMD/ATI is not what the GPUs do, but what they do at a given power envelope. That's why the 5xxx radeons still outsell the fermi boards, and the radeon 6xxx leaves nv with nowhere but price cuts they can't really afford.



    The 512 shader GTX 480 will likely never see the light of day... because the power it draws is outside the rather generous PCIE spec. It's like they took a bad idea and overclocked it until it just beat the 5870.
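
    For reference, here's the budget I mean, as a quick sketch (standard per-connector limits, assuming the reference 480's one 6-pin plus one 8-pin layout):

    Code:

    # Back-of-the-envelope PCIe power budget for a card wired like the
    # reference GTX 480 (one 6-pin plus one 8-pin connector).
    # These are the standard per-connector limits, not measured numbers.

    SLOT_WATTS = 75        # PCIe x16 slot
    SIX_PIN_WATTS = 75     # 6-pin PEG connector
    EIGHT_PIN_WATTS = 150  # 8-pin PEG connector

    budget = SLOT_WATTS + SIX_PIN_WATTS + EIGHT_PIN_WATTS
    print(f"Spec power budget for this layout: {budget} W")
    # 300 W total -- the ceiling a fully enabled 512-shader part would have
    # to live under without adding another connector.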



    Hopefully their next rev will take into account massive learning experiences from this one. Like a Pentium 4-> Core transition for GPUs.
  • Reply 7 of 15
    Quote:
    Originally Posted by 1337_5L4Xx0R


    Hopefully their next rev will take into account massive learning experiences from this one. Like a Pentium 4-> Core transition for GPUs.



    There is a fascinating parallel with the Intel experience, isn't there? nVidia got caught by the same problem, whereas ATI was bought by AMD... who had just finished successfully navigating their way around the power wall. ATI had been on track to run into the same power problems as nVidia, and I wonder if the merging of the two companies was why they shifted their emphasis to performance/watt early and dodged the bullet. It gives them time to open a lead in the market. Intel has closed that gap on AMD in the CPU business, but will nVidia have the staying power to recover from this and catch up to AMD? I'm a bit doubtful, and suspect that wizard69 will be right, especially given their x86 gambit.
  • Reply 8 of 15
    wizard69 Posts: 13,377 member
    Quote:
    Originally Posted by 1337_5L4Xx0R


    I agree. The reason they've been getting served by AMD/ATI is not what the GPUs do, but what they do at a given power envelope. That's why the 5xxx radeons still outsell the fermi boards, and the radeon 6xxx leaves nv with nowhere but price cuts they can't really afford.



    Power usage is certainly a big factor. Personally I can't understand the need for GPUs that draw 500 or more watts on the desktop. For CUDA in HPC installations there may be a point, but those installations won't keep NVidia afloat, especially considering the perpetual state of supercomputer builders.



    There is more to it than that, though. NVidia's advantage with respect to drivers is slowly eroding away, and in some cases is completely gone, so there is one less reason to put up with NVidia. Companies have put up with NVidia for some time, and frankly I think many are tired of NVidia management. If your products are no longer light years ahead of the competition, you have to play the game differently. Right now NVidia is significantly behind AMD in some respects.



    So who do you work with as a PC builder? AMD has certainly put much effort into restructuring ATI for the future. Even though it took time and still has a ways to go, they put significant effort into their drivers. Nvidia, on the other hand, ships high-tech space heaters.

    Quote:

    The 512 shader GTX 480 will likely never see the light of day... because the power it draws is outside the rather generous PCIE spec. It's like they took a bad idea and overclocked it until it just beat the 5870.



    I don't know about overclocking, as I don't know what the design goals were. Even if the video card needed extra power lines directly from the power supply, that isn't the problem as I see it.



    The problem is that the competition is doing as well at a much lower power level. That can only mean NVidia's engineering/management made some very bad decisions.

    Quote:

    Hopefully their next rev will take into account massive learning experiences from this one. Like a Pentium 4-> Core transition for GPUs.



    I seriously doubt it. For one, AMD is about to introduce an incremental spin. Second, AMD could target their 32nm process for the next high-performance GPU rev, or maybe go even smaller. Plus I suspect AMD is learning a lot from its Fusion project that could be applied to discrete GPUs. It will be very difficult for NVidia to catch up to AMD today, much less two years down the road.



    As far as those patents and so forth go, it would be pretty funny to see Apple buy out NVidia and use those patents as leverage to get Intel to refocus. If Intel doesn't, Apple could simply license all of NVidia's tech to AMD and Imagination. Apple could really bend Intel over here.



    There are significant differences between the Pentium 4 case and NVidia's current hole. For one, Intel already had the resources in place engineering low-power laptop components. Pentium 4 may have been the result of short-sighted management decisions coupled with bad marketing, but it was not due to a lack of technical capability. On top of that, two other things happened.



    One is that people started to realize a computer could be the equivalent of a room full of lights left on 24/7. I don't align myself with the green movement and some of the things it promotes, but massive power waste is stupid. Once people became sensitive to all that wasted energy, the desire to own these power-mongering PCs dropped significantly.



    The other is that multicore technology became understood by the mainstream. By that I mean most people realized they got a better user experience with multicore computers and OSes to drive them. Suddenly the race was no longer about clock rate, nor even ultimate single-core performance.



    Rather, the race became one where threads mattered, along with power usage. Intel was simply able to retask a design team to address these issues. NVidia doesn't have this capability, and it isn't clear they realize where their troubles lie.
  • Reply 9 of 15
    TSMC is skipping 32nm and heading straight to 28, IIRC. Hence the non-die-shrunk radeon 6000 series due in a few weeks; 32nm is delayed... forever. Taiwan Semiconductor (TSMC) make both NV and ATI GPUs.



    Some good news for Sandy Bridge, spun as bad news for NV:

    http://www.semiaccurate.com/2010/09/...s-dark-secret/



    Video transcoding, a strength of CUDA in the consumer space, is getting dedicated ICs in Sandy Bridge.
  • Reply 10 of 15
    The last Anandtech article testing the new nVidia cards clearly shows the issue with power consumption and heat. It is a no-brainer to buy AMD/ATI right now. I am also thinking of making a heat sink for nVidia cards that doubles as a room heater, a coffee maker...
  • Reply 11 of 15
    wizard69wizard69 Posts: 13,377member
    Quote:
    Originally Posted by talksense101


    The last Anandtech article testing the new nVidia cards clearly shows the issue with power consumption and heat. It is a no-brainer to buy AMD/ATI right now. I am also thinking of making a heat sink for nVidia cards that doubles as a room heater, a coffee maker...



    A no-brainer is exactly it. The only way the average consumer would go NVidia right now is if they had no concern about their electric bill. In many ways the AMD solutions perform as well or better while being significantly cooler.



    On top of all of that, AMD is slowly letting out info on their Fusion products. While the initial devices are functionally lower end, one can't dismiss a few things. One important element here is the remarkably low power achieved for a device with serviceable graphics and CPU resources. While we could all wish for better performance, Zacate and Ontario highlight the benefits of aggressive power management.



    At the same time that AMD is targeting low power, they quickly eliminate a market for NVidia's lower-end components. With Intel effectively doing the same thing, NVidia loses both its high-end and low-end markets. It is not yet understood how much flexibility AMD is going to offer Zacate implementers, but if the graphics core can be bumped speed-wise, by either the manufacturer or the overclocking user, the device would likely serve a wide range of users well.



    One final consideration here: if AMD is smart, this product would be offered up for full custom treatment. Zacate with purpose-built I/O on the same die seems to be both doable and economical given the die size. Machines with no north bridge at all then become viable. Even if they don't do custom SoCs, AMD had better be thinking about a higher-integration device right now. Apple could easily make a Mini that is half the current Mini's size.



    In any event NVidia is screwed and there is nothing they can do about it.
  • Reply 12 of 15
    Quote:
    Originally Posted by 1337_5L4Xx0R


    TSMC is skipping 32nm and heading straight to 28, IIRC. Hence the non-die-shrunk radeon 6000 series due in a few weeks; 32nm is delayed... forever. Taiwan Semiconductor (TSMC) make both NV and ATI GPUs.



    Some good news for Sandy Bridge, spun as bad news for NV:

    http://www.semiaccurate.com/2010/09/...s-dark-secret/



    Video transcoding, a strength of CUDA in the consumer space, is getting dedicated ICs in Sandy Bridge.



    The Radeon 6000s look nice, and finally there are decent cards to fit into the space between $150 and $250. The Radeon 5850 is great but has been at its price point for too long.



    As for Nvidia, the 450 didn't even come close to replicating the success of the 460, which is Nvidia's only bright spot right now. I don't know about their financials, but prospects are not good.
  • Reply 13 of 15
    Quote:
    Originally Posted by 1337_5L4Xx0R


    I agree. The reason they've been getting served by AMD/ATI is not what the GPUs do, but what they do at a given power envelope. That's why the 5xxx radeons still outsell the fermi boards, and the radeon 6xxx leaves nv with nowhere but price cuts they can't really afford.



    The 512 shader GTX 480 will likely never see the light of day... because the power it draws is outside the rather generous PCIE spec. It's like they took a bad idea and overclocked it until it just beat the 5870.



    Hopefully their next rev will take into account massive learning experiences from this one. Like a Pentium 4-> Core transition for GPUs.



    But this is twice Nvidia hasn't learned. My username is from the 2007-2008 era, when the Nvidia 8000 series was kicking major butt. The 9000s were slight improvements, and IIRC 8000s and 9000s in laptops had major advantages over ATI thanks to the Nvidia G92 chip. Then what was first named the GTX 200 series had literally the same problems as Fermi: too big, too hot, too hard to come down from a higher nm process, not good enough yields, etc., etc.



    So this is twice in a row Nvidia has fumbled the ball. If not for CUDA, PhysX and their really strong marketing, especially getting low-end dedicated and integrated Nvidia GPUs into laptops... Nvidia could have been much worse off.
  • Reply 14 of 15
    Marvin Posts: 15,435 moderator
    Quote:
    Originally Posted by nvidia2008


    So this is twice in a row Nvidia has fumbled the ball. If not for CUDA, PhysX and their really strong marketing, especially getting low-end dedicated and integrated Nvidia GPUs into laptops... Nvidia could have been much worse off.



    Their marketing has definitely helped; they have a really strong brand. One of the most surprising marketing achievements is Optimus. That's Nvidia basically saying their GPUs don't ramp down to a low enough power level, so they put in software that turns their GPU off to let you use Intel's integrated graphics, and yet it's branded NVidia Optimus. It should be called NVidia not-Optimus-enough-so-use-Intel's-IGP-instead. But that wouldn't fit on the little sticker.



    It'll be interesting to see what happens to the company long-term. I think Intel and AMD will push ahead with the fused CPU/GPU chips and lock NVidia out. They will still sell the dedicated chips, but that business will diminish the faster IGPs get and the smaller the machines manufacturers build. I think AMD and Intel would rather see NVidia just fade out rather than bother with a buyout. They might hire some of the staff but that's about it.
  • Reply 15 of 15
    Quote:
    Originally Posted by Marvin


    I think AMD and Intel would rather see NVidia just fade out rather than bother with a buyout. They might hire some of the staff but that's about it.



    No, they (and others) will squabble and bid over the patent portfolio that is buried within nvidia's dead and rotting corpse.



    Should that come to pass, of course.