Hi,
I have an Nvidia card that makes me wonder what the root cause of its MATS errors is. The story is as follows:
1. When I received the card, MATS showed one memory chip with errors.
2. I replaced that memory chip, but nothing changed: still the same error, so the memory chip itself is ruled out.
3. I carefully reballed the GPU. MATS then showed 3 memory chips with errors: the original one and 2 new ones.
4. I reballed the GPU a second time, and MATS again showed only one memory chip with errors, the same one as in the beginning.
5. I reballed it a third time, and again the same 3 memory chips showed errors.
I wonder what the root cause of these unpredictable results is. I have four theories:
A. The GPU is broken in some way: the internal bonds (the connections between the silicon die and the carrier PCB with the pads, made with very thin gold wire) are unreliable or broken. Heating the chip during reballing may "repair" the bonds in some way.
B. Some of the GPU pads on the card PCB have a broken connection to the tracks leading to the memory chips, right between the pad and the beginning of the track. The reason might be low PCB quality or thermal stress leading to material failure. The same could apply to the memory pads.
C. Same as B, but for the pads on the GPU carrier PCB. The same could apply to the memory chip carrier PCB.
D. The GPU silicon itself has degraded and is not fixable.
What do you think: is one of these theories correct, and could it be possible to repair the card in such a strange case?
Maybe I'll try moving this GPU to another card with the same GPU chip model. That could give more information about where to look for the root cause, but I need to get such a board first.
I'm leaning toward theory A, as I have a case that supports it. I have a motherboard that does not start normally after pressing the power button. However, when I push down hard on the chipset die, the board starts and works normally. I have reballed this chipset twice and it did not help, so the root cause must be in the chipset itself, or the chipset pads on the board have a bad connection to their tracks.
Let me know your experience and observations.