Quote:
Originally Posted by
1st 
"Nvidia KNEW what the composition of the materials were from the beginning." - they better do... if it is their design. Material and Process are closely linked. The selection of materials determine the thermal process limitation of the device, the conductivity, even its performance (mechanical stress distribution, reliability under specific environment, etc).
Yes they are related. That's why people in the industry are wondering why this was done.
Quote:
"They CHANGED the solder from tin to lead base to lower costs". - The change from tin to lead is definitely not due to cost. The cost differential would be minimal, if not higher for the high lead solder. Are you sure?
Yes, I am sure. There are too many independent reports coming from Nvidia's suppliers as well as its OEMs to leave any doubt. At first, Nvidia claimed that the problem came from one of its subcontractors, but it turned out that it was coming from all of them.
Tin is more expensive than lead, and the same goes for the solder formulations based on them. The difference isn't that much on this scale for a manufacturer, though.
But from my own experience, I know that you do look down that road of infinitesimal costs adding up over long runs. All of Nvidia's chips, from GPUs to chipsets, were changed. That's tens of millions of chips. That does add up.
It almost seems as though they had several teams working on these products that weren't kept aware of the changes in this one area of the process, and so didn't look for the problems that would ensue.
This isn't the first time in manufacturing that something like this has happened, and it won't be the last.
While that price chart is now old, the relative price differential hasn't changed much. You can see that the price of tin is about eight times the price of lead. Tin-based solder also includes about 5% copper, which is also somewhat expensive.
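To put rough numbers on the "it adds up" argument, here is a minimal Python sketch. Only the eight-to-one tin-to-lead price ratio and the "tens of millions of chips" scale come from the discussion above. The solder mass per chip, the lead price, and the simplification of comparing pure tin against pure lead are assumptions made purely for illustration, and the function name is hypothetical.

```python
def solder_cost_saving(chips, grams_per_chip, lead_price_per_kg, tin_to_lead_ratio=8.0):
    """Estimate the total saving from using lead instead of tin.

    Simplification (assumption): treats the two solders as pure tin and
    pure lead, ignoring the other alloying elements.
    """
    tin_price_per_kg = lead_price_per_kg * tin_to_lead_ratio
    kg_per_chip = grams_per_chip / 1000.0
    saving_per_chip = (tin_price_per_kg - lead_price_per_kg) * kg_per_chip
    return saving_per_chip * chips


# All three inputs below are made-up placeholders, not real figures.
total = solder_cost_saving(
    chips=50_000_000,       # "tens of millions of chips"
    grams_per_chip=0.05,    # assumed solder mass per chip
    lead_price_per_kg=2.0,  # assumed price, in any currency
)
print(f"Estimated saving: {total:,.0f}")
```

The saving per chip is tiny and scales linearly with volume, so the total depends entirely on the solder mass and prices you plug in.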
Quote:
"They also knew the bumps were tin." - Are you sure it is tin? why tin is better than lead? how could it impact performance? specifically, impact on thermal properties? why you so convinced it is the bump causing the problem? Any crack of the bump? Under what condition?
Yes, it's tin.
Everything is now moving to tin-based solder. We even moved to tin-based solder here in the US quite a few years ago for drinking water systems. But that solder also contains elements that raise the melting point.
Tin-based solder for electronics lowers the melting point. Eutectic solder has a melting temperature quite close to its softening temperature, and is more predictable.
Tin is better because of the lower temperature, and also because tin is not harmful to the environment the way lead is.
I don't know if you need a lesson on how soldering works (or if anyone else is interested), but I have to assume something, so here is a bit.
Normally, a surface is "tinned". This is an old term that comes from hand soldering with a fire-heated "copper", used to join sheets of metal to make almost anything. You can see tinning on the leads of electrical components. It protects the metal underneath and "wets" the surface, which means it becomes easier to solder. Normally, the coat is no more than a few thousandths of an inch thick. When you solder, even with lead/tin solder, the solder melds into that thin layer.
With wires, either they are in holes, which provide strength, or are wrapped around something, which provides strength, and the solder is just for electrical purposes.
With surface mount devices, there is no such mechanical reinforcement. The solder joint provides all the mechanical strength. Normally, this is fine. Surface mount devices are designed with that in mind.
When more complex chips like CPUs or GPUs, which can get hotter at some of those joints, are soldered to boards this way, bumps are most often used. The bumps help the soldering process (usually done with a wave soldering machine, hot air, or another specialty unit), as they provide just the right amount of solder, and enough mass for the connection to take the current and the heat. Also, unlike devices such as resistors, caps, and small chips, CPUs, GPUs, and other larger chips have so many connections that many, if not most, are under the chip. There is no way to get solder there unless it's already present on the board and the chip. That is another reason why this is so critical. Soldering between the chip and the board is difficult and delicate.
This is a pretty well understood technology.
The bumps and pads must use the same solder. With regular soldering, it's not that important, because the way it's done will meld the entire amount of solder together into one mixed mass.
But here, that works differently. There is just enough heat applied to do the melt.
If there is one type of solder, no problem. The two halves melt together at the surfaces, and become one mass of the same material.
But if the two materials are different, the situation is also different.
What happens then depends on a number of things.
As lead-based solder and tin-based solder have different working temperatures, the question of which temperature was used comes into play. Was it the lower one used for tin, or the higher one used for lead? Most surface mount joints can withstand about 2.75 seconds of heating, no more. So what was the dwell time? It's longer for lead than for tin.
At any rate, what will happen is that as the surfaces melt together, there is just a thin layer of melt mixing.
If the lower temp was used, the lead may not go into a melt state, may begin to lose crystallization, and may end up being soldered TO, rather than being soldered INTO. See the difference?
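The TO versus INTO distinction can be sketched as a simple check of the process temperature against each solder's melting point. This is only an illustration: the function name and the temperatures in the example are hypothetical placeholders, not the actual alloys or process settings, and the sketch ignores dwell time and any partial dissolution at the interface.

```python
def classify_joint(peak_temp_c, bump_melt_c, pad_melt_c):
    """Classify a bump-to-pad joint by which solder reaches a melt state."""
    bump_melts = peak_temp_c >= bump_melt_c
    pad_melts = peak_temp_c >= pad_melt_c
    if bump_melts and pad_melts:
        # Both sides melt and mix into one mass.
        return "soldered INTO"
    if bump_melts or pad_melts:
        # Only one side melts, so it bonds onto the surface of the other.
        return "soldered TO"
    return "no joint"


# Placeholder numbers: a higher-melting bump, a lower-melting pad,
# and a process run at the lower temperature.
print(classify_joint(peak_temp_c=220, bump_melt_c=300, pad_melt_c=185))
```

With the lower process temperature, only one of the two solders melts, which is the "soldered TO" case described above.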
As some joints heat up during extreme periods of operation, the different expansion coefficients will then come into play.
We don't know the answer here.
I can't tell you what the coefficients are, because I don't know the particular alloys involved.
Suffice it to say that they are different. What has apparently happened is that as they expanded, the joints in question either tore from shear, or "popped" from lack of tensile strength between the two solder types.
That's why they have to examine the joints themselves.
We know enough to get a good idea of what happened, but not enough to know exactly what happened.
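For anyone who wants to see how the expansion mismatch scales, here is a minimal Python sketch of the standard free-expansion relation: mismatch strain equals the difference in expansion coefficients times the temperature swing. The coefficients and temperature rise in the example are placeholders chosen only for illustration, since the real alloys are not known, and the function names are hypothetical.

```python
def mismatch_strain(alpha_a, alpha_b, delta_t):
    """Free thermal-expansion mismatch strain between two materials.

    alpha_a, alpha_b: linear expansion coefficients in 1/K
    delta_t: temperature rise in K
    """
    return abs(alpha_a - alpha_b) * delta_t


def relative_displacement(alpha_a, alpha_b, delta_t, length):
    """Relative movement across a feature of the given length (same unit as length)."""
    return mismatch_strain(alpha_a, alpha_b, delta_t) * length


# Placeholder values only; the actual alloy coefficients are unknown here.
strain = mismatch_strain(alpha_a=29e-6, alpha_b=23e-6, delta_t=60.0)
print(f"Mismatch strain: {strain:.2e}")
```

The strain grows linearly with both the coefficient difference and the temperature swing, and the joint sees that swing repeated every time the chip heats up and cools down.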
As to why, cost cutting and stupidity are usually to blame for problems like this.
For the past two years, ATI has been eating Nvidia's lunch. Nvidia has been having major financial problems. This was one way to cut costs. They just didn't follow through.