There are a lot of opinions regarding shimming of laptop GPUs after reflowing or reballing. I'm for it, but I've seen plenty of shims done wrong:
- Really thin, small shim (sub-0.5mm), placed over the stock thermal rubber - nearly useless. Same as stock thermal rubber.
- Same sub-0.5mm shim, with paste on both sides - even worse than the stock rubber. The distance from chip to heatsink is larger than the shim is thick, and paste is a very poor heat conductor when the surfaces are not very close together.
- 1mm to 1.5mm shim the size of the die, with paste - good, but it can get a bit better.
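To put rough numbers on why a thin shim with a thick layer of paste does so badly, here is a minimal sketch using the plain conduction formula R = t / (k * A). The die size, the 1.5mm gap and the paste conductivity are assumptions picked for illustration, not measurements from any particular laptop; the copper figure is a typical handbook value.

```python
def slab_resistance(thickness_m, conductivity_w_mk, area_m2):
    """Conduction resistance of a flat layer in K/W: R = t / (k * A)."""
    return thickness_m / (conductivity_w_mk * area_m2)

# Assumed, illustrative values (not measured):
DIE_AREA = 10e-3 * 10e-3   # 10 mm x 10 mm die, in m^2
K_PASTE = 1.0              # W/(m*K), cheap white paste (assumption)
K_COPPER = 390.0           # W/(m*K), typical for copper

def stack_resistance(gap_m, shim_m):
    """Copper shim in a fixed chip-to-heatsink gap; paste fills the rest."""
    paste_m = gap_m - shim_m
    return (slab_resistance(paste_m, K_PASTE, DIE_AREA)
            + slab_resistance(shim_m, K_COPPER, DIE_AREA))

thin = stack_resistance(gap_m=1.5e-3, shim_m=0.4e-3)   # mostly paste
thick = stack_resistance(gap_m=1.5e-3, shim_m=1.4e-3)  # mostly copper
print(f"0.4 mm shim: {thin:.2f} K/W, 1.4 mm shim: {thick:.2f} K/W")
```

With these assumed numbers nearly all of the resistance sits in the paste layer, so what matters is how much of the gap the copper fills, not the mere presence of a shim.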
Here's a Dell Vostro 1510 I picked up some time ago for cheap. It had never been worked on and it's in pretty good nick. Since I was out of new chips, I pulled the existing GPU, reflowed the die only at 260°C, then reballed it and soldered it back to the board. It works, but I would like to keep it working for as long as possible without another intervention, as the board quality is crap and I had to be very careful not to break traces or lift pads while reworking. I am going to order a new GPU, but I'd rather just keep it around in case it's needed. So I needed something proper to replace the stock rubber.
There is one important thing you have to know about the defective nVidia chips, besides the fact that it's a bad idea to let them get anywhere close to 80°C because of the wrong underfill choice. That is only part of the problem. What I believe is that, to save on die area, nVidia ran more current through the bumps than they should have, and the damage comes from internal heating caused by the current flowing through (some of) the bumps, rather than just from absolute temperature. Of course, it gets worse as absolute temperature increases, too, because the resistance of the bumps goes up. This goes on until part of the chip goes into thermal runaway and the bumps crack and separate the die from the substrate.
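The feedback loop described above (hotter bump, higher resistance, more self-heating) can be written down as a toy model. This is only a sketch of the idea: the bump resistance, temperature coefficient, thermal resistance and currents below are all hypothetical numbers chosen to make the arithmetic visible, not data for any real chip.

```python
def bump_temperature(t_die_c, current_a, r0_ohm, alpha_per_k,
                     r_th_k_per_w, t_ref_c=25.0):
    """Steady-state temperature of one bump, or None if the loop runs away.

    Electrical resistance: R(T) = r0 * (1 + alpha * (T - t_ref))
    Self-heating:          T = t_die + r_th * I^2 * R(T)
    Solving the two together for T gives the closed form below.
    """
    g = r_th_k_per_w * current_a**2 * r0_ohm  # rise in K at reference resistance
    loop_gain = g * alpha_per_k
    if loop_gain >= 1.0:
        return None  # every degree of heating adds at least one more: runaway
    return (t_die_c + g * (1.0 - alpha_per_k * t_ref_c)) / (1.0 - loop_gain)

# Hypothetical bump: 10 mOhm, 0.4 %/K, 500 K/W to the rest of the die.
BUMP = dict(r0_ohm=0.01, alpha_per_k=0.004, r_th_k_per_w=500.0)

rise_60 = bump_temperature(60.0, 0.5, **BUMP) - 60.0
rise_80 = bump_temperature(80.0, 0.5, **BUMP) - 80.0
print(f"extra heating at 60 C die: {rise_60:.2f} K, at 80 C die: {rise_80:.2f} K")
```

In this model the extra heating scales with the square of the current and grows with die temperature, which matches the argument that overdriven bumps suffer more the hotter the chip already is.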
But still, they are damaged not so much by absolute maximum temperatures as by fast hot/cold thermal cycling, which is why the desktop parts had a much lower failure rate and longer lifespan than the mobile ones, despite having the same manufacturing defect and being very similar parts (and in some cases, exactly the same).
So, to keep these chips alive, one must ensure that they heat up slowly and also cool down slowly. The reason very few CPUs die compared to GPUs is exactly this: the part of the heatsink that covers the CPU is always engineered to fit exactly on top of the die and always uses paste for heat transfer instead of rubber or foam. Also, the transfer surface between the CPU die and the heatpipe is always a copper plate, with significant surface area and thickness compared to the size of the CPU die and the size of the heatpipe. This creates thermal mass: the plate can uniformly store a larger amount of heat and transfer it to the heatpipe more efficiently than if the heatpipe were placed directly on top of the die. Note that the heatpipe is always at most the size of the die(s), and usually smaller.
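As a back-of-the-envelope check on the thermal mass argument, the sketch below compares the heat capacity of a CPU-style copper plate with a small aluminum pad, and how fast each would warm up at first if the chip dumped heat into it with nothing carried away yet. The plate dimensions and the 15 W figure are assumptions for illustration; the densities and specific heats are standard handbook values.

```python
# Handbook material properties: density in kg/m^3, specific heat in J/(kg*K).
COPPER = {"density": 8960.0, "specific_heat": 385.0}
ALUMINUM = {"density": 2700.0, "specific_heat": 900.0}

def heat_capacity(material, length_m, width_m, thickness_m):
    """Heat capacity of a rectangular plate in J/K: C = rho * V * c_p."""
    volume = length_m * width_m * thickness_m
    return material["density"] * volume * material["specific_heat"]

def initial_heating_rate(power_w, capacity_j_per_k):
    """Temperature rise in K/s right after power is applied: dT/dt = P / C."""
    return power_w / capacity_j_per_k

# Assumed sizes: 30 x 30 x 2 mm copper plate vs 15 x 15 x 1 mm aluminum pad.
cu_cap = heat_capacity(COPPER, 30e-3, 30e-3, 2e-3)
al_cap = heat_capacity(ALUMINUM, 15e-3, 15e-3, 1e-3)

POWER_W = 15.0  # assumed GPU heat output
cu_rate = initial_heating_rate(POWER_W, cu_cap)
al_rate = initial_heating_rate(POWER_W, al_cap)
print(f"copper: {cu_cap:.2f} J/K, {cu_rate:.1f} K/s")
print(f"aluminum: {al_cap:.2f} J/K, {al_rate:.1f} K/s")
```

With these assumed sizes the copper plate holds roughly eleven times more heat per degree, so the same load swing shows up as a much gentler temperature ramp at the die.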
By contrast, GPU cooling solutions usually use a little heatpipe (or the same heatpipe as the CPU). The contact surface is a little piece of aluminum (copper, only if you're lucky) soldered to the copper heatpipe, and it sits at a significant distance from the GPU die. Thermal transfer is done through foam or rubber. As you can see, there are a couple of thermal resistances involved here:
- Resistance of the thermal foam or rubber used (not all thermal pads are born alike);
- Resistance of the aluminum to copper connection.
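Since the resistances listed above sit in series, the same heat flows through each one and the temperature drops simply add up. Here is a minimal sketch of that budget; the K/W values and the 12 W load are hypothetical placeholders, only there to show how one bad layer dominates the total.

```python
def temperature_drops(power_w, resistances_k_per_w):
    """Temperature drop in K across each series layer: dT = P * R."""
    return {name: power_w * r for name, r in resistances_k_per_w.items()}

# Hypothetical resistances in K/W for a stock GPU cooling path.
stock_path = {
    "thermal pad": 1.5,
    "aluminum plate": 0.1,
    "aluminum-to-copper joint": 0.3,
}

drops = temperature_drops(12.0, stock_path)  # 12 W is an assumed load
total = sum(drops.values())

for name, drop in drops.items():
    print(f"{name}: {drop:.1f} K ({100 * drop / total:.0f}% of total)")
print(f"die sits about {total:.1f} K above the heatpipe")
```

Swapping in your own estimates for each layer shows quickly which one is worth attacking first.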
Also, the foam-type pads tend to wear out, crack and crumble over time.
And finally, a smaller thermal mass means localized heating will occur in the GPU area instead of the heat being distributed evenly across the heatpipe and making its way to the exhaust of the fan. Although most mobile GPUs have significantly lower power consumption than CPUs, they almost universally run hotter because of this construction.
So, to keep the GPU happier and out of trouble, we should treat it like a CPU. More thermal mass, less thermal resistance. Here's my take on it:
The "shim" in this case is a copper plate which used to be the transfer area from CPU to heatpipe in another laptop. I unsoldered it from the heatpipe and its supports, then sanded it down nicely until reasonably flat. You can still see a thin layer of solder on the top side, where the heatpipe used to go.
IMHO, heatsinking should be done in a way that allows the use of cheap white goop for thermal paste, and the way I did it allows exactly this. Not that I wouldn't recommend the use of high quality paste. But besides being a cheapskate, I also believe that if the 2-3°C improvement brought by better thermal paste is what keeps a laptop alive, the cooling system has serious issues.